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Abstract 


When  computing  eigenvalues  of  symmetric  matrices  and  singular  values  of  general  matrices 
in  finite  precision  arithmetic  we  in  general  only  expect  to  compute  them  with  an  error  bound 
proportional  to  the  product  of  machine  precision  and  the  norm  of  the  matrix.  In  particular, 
we  do  not  expect  to  compute  tiny  eigenvalues  and  singular  values  to  high  relative  accuracy. 
There  are  some  important  classes  of  matrices  where  we  can  do  much  better,  including  bidiag- 
onal  matrices,  scaled  diagonally  dominant  matrices,  and  scaled  diagonally  dominant  definite 
pencils.  These  classes  include  many  graded  matrices,  and  all  symmetric  positive  definite 
matrices  which  can  be  consistently  ordered  (and  thus  all  symmetric  positive  definite  tridiago- 
nal  matrices).  In  particular,  the  singular  values  and  eigenvalues  are  determined  to  high  rela- 
tive precision  independent  of  their  magnitudes,  and  there  are  algorithms  to  compute  them 
this  accurately.  The  eigenvectors  are  also  determined  more  accurately  than  for  general 
matrices,  and  may  be  computed  more  accurately  as  well.  This  work  extends  results  of  Kahan 
and  Demmel  for  bidiagonal  and  tridiagonal  matrices. 
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1.  Introduction 

When  computing  the  eigenvalues  of  symmetric  matrices  and  singular  values  of  general 
matrices  in  finite  precision  arithmetic  one  generally  only  expects  to  compute  them  with  an 
error  bound /(n)e  1|A  ||,  where /(n)  is  a  modestly  growing  function  of  the  matrix  dimension 
n,  £  is  the  machine  precision,  and  ||A  ||  is  the  2-norm  of  the  matrix  A.  This  follows  as  a 
result  of  standard  theorems  which  state: 

(1.1)  A  perturbation  hA  in  the  matrix  A  cannot  change  its  eigenvalues  (singular  values)  by 
more  than  i|5A  ||  [12]. 

(1.2)  The  standard  algorithm  for  computing  eigenvalues  (singular  values)  of  A  computes  the 
exact  eigenvalues  (singular  values)  of  A+hA,  ||5A  ||  ^  f  in)e\\A\\,  where  f  (n)  is  a 
modestly  growing  function  of  n  and  e  is  the  machine  precision  [12]. 

These  error  bounds  imply  that  tiny  eigenvalues  and  singular  values  (tiny  compared  to 
\\A  II)    cannot    generally    be    computed    to    high    relative    accuracy,    since    the    error    bound 
/(n)e  IJA  II  may  be  much  larger  than  the  desired  quantity.  In  fact,  if  each  matrix  entry  is  unc- 
ertain in  its  least  significant  digits,  the  tiny  eigenvalues  and  singular  values  may  not  even  be 
determined  accurately  by  the  data. 

Sometimes,  however,  the  eigenvalues  and  singular  values  are  determined  much  more 
accurately  than  error  bounds  like  /(n)€||A||  would  indicate.  This  was  shown  for  singular 
values  of  bidiagonal  matrices  in  [9],  where  it  was  proven  that  small  relative  perturbations  in 
the  bidiagonal  entries  only  cause  small  relative  perturbations  in  the  singular  values,  indepen- 
dent of  their  magnitudes.  It  was  also  shown  how  to  compute  all  the  singular  values  to  high 
relative  accuracy.  In  this  paper  we  extend  these  results  to  eigenvalues  of  symmetric  scaled 
diagonally  dominant  matrices  and  scaled  diagonally  dominant  definite  pencils.  (Henceforth 
we  will  abbreviate  "scaled  diagonally  dominant"  by  s.d.d.)  A  symmetric  s.d.d.  matrix  is  any 
matrix  of  the  form  AAA,  where  A  is  symmetric  and  diagonally  dominant  in  the  usual  sense, 
and  A  is  an  arbitrary  nonsingular  diagonal  matrix.  A  pencil  H  —  kM  is  s.d.d.  definite  if  H 
and  M  are  symmetric  s.d.d.  and  M  is  positive  definite.  Examples  of  s.d.d.  matrices  are  the 
"graded"  matrices 
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Note  that  Aq  is  graded  in  the  usual  sense,  but  not  diagonally  dominant  in  the  usual  sense.  A] 
is  neither  diagonally  dominant  in  the  usual  sense,  nor  graded  in  the  usual  sense,  since  the 
diagonal  entries  are  positive  and  negative,  and  not  sorted.  Thus  we  see  that  the  usual  diago- 
nal dominance  implies  being  s.d.d.,  but  not  the  converse.  In  fact,  the  set  of  s.d.d.  matrices 
includes  all  symmetric  positive  definite  matrices  which  can  be  consistently  ordered,  a  class 
which  includes  all  symmetric  positive  definite  tridiagonal  matrices.  Dense  matrices  may  be 
s.d.d.  as  well. 

Another  example  arises  from  modeling  a  series  of  masses  m],  .  .  .  ,m„  on  a  line  con- 
nected by  simple,  linear  springs  with  spring  constants  ko,  .  .  .  ,k„  (the  ends  of  the  extreme 
springs  are  fixed).  The  natural  frequencies  of  vibration  of  this  system  are  the  square  roots 
of  the  eigenvalues  of  the  s.d.d.  definite  pencil  H-kM,  where  M  is  the  diagonal  mass  matrix 
diag(m,,  .  .  .  ,m„)  and  H  is  the  tridiagonal  stiffness  matrix  with  diagonal  k^  +  k^,  ii^/:;  ,..., 
k„-i^k„  and  offdiagonal  -k^  ,...,  -^„_i.  Note  that  the  matrix  ;V/~''-/fA/~' -,  which  has  the 
same  eigenvalues  as  H  —  kM,  is  symmetric  s.d.d. 

In  particular,  we  will  show  that  small  relative  perturbations  in  the  entries  of  an  s.d.d. 
matrix    only    cause    small    relative    perturbations    in    the    eigenvalues    and    singular   values, 


independent  of  their  magnitudes.  This  is  a  much  tighter  perturbation  bound  than  the  classical 
one  provided  by  (1.1)  above.  Our  proof  of  this  result  generalizes  and  unifies  results  in  [9] 
for  bidiagonal  matrices  alone  and  in  [14]  for  symmetric  tridiagonal  s.d.d.  matrices  alone. 

Given  that  the  matrix  entries  determine  all  eigenvalues  or  singular  values  to  high  rela- 
tive accuracy,  one  would  naturally  like  to  compute  them  that  accurately  as  well.  We  present 
algorithms  based  on  bisection  which  attain  this  accuracy;  in  the  case  of  bidiagonal  or  sym- 
metric positive  definite  tridiagonal  matrices  QR  iteration  (suitably  modified)  can  be  shown  to 
attain  high  accuracy  as  well.  It  is  not  yet  known  whether  algorithms  based  on  divide  and  con- 
quer [5,  11,  13]  can  be  made  to  work  in  some  of  these  situations  too. 

One  may  also  ask  if  the  singular  vectors  and  eigenvectors  of  s.d.d.  matrices  and  pencils 
are  determined  any  more  accurately  than  for  general  matrices.  To  state  the  standard  pertur- 
bation bound  for  eigenvectors  of  symmetric  matrices  and  singular  vectors  of  general  matrices, 
we    need     to     define     the    gap:    if     X,     is     an     eigenvalue     (singular    value)     of    A     then 

gap(\,)  —  min  |X,— Xy|.     In   other  words,   it  is   the   absolute   distance   between   X,    and   the 

J"" 
remainder  of  the  spectrum. 

(1.3)  Let  y  be  a  unit  eigenvector  of  A  -(-5A,  a  =  y^Ay  the  Rayleigh  quotient,  X,  the  eigen- 
value of  A  closest  to  a,  and  z,  its  unit  eigenvector.  Let  Q(z,,y)  be  the  acute  angle 
between  y  and  z,.  Then  sin  e(y,z,)  s  4||SA  \\/gap{X,)  [17,  p.  222]. 

In  other  words,  the  error  as  measured  by  the  angle  is  proportional  to  the  reciprocal  of  the 
gap;  if  the  gap  is  small  (X,  is  in  a  cluster  of  eigenvalues),  the  corresponding  eigenvector  is 
poorly  determined.  As  before,  the  standard  algorithms  guarantee  ||8A  ||  :£ /(n)e  ||A  ||,  so 
eigenvectors  of  eigenvalues  poorly  separated  with  respect  to  ||i4  ||  (i.e.  ||A  \\/gap{\,)  is  large) 
will  generally  not  be  computed  accurately.  Analogous  results  hold  for  singular  vectors  of 
general  matrices. 

For  eigenvectors  of  s.d.d.  matrices,  a  stronger  perturbation  theorem  is  true.  Briefly,  we 
can  replace  the  gap  in  (1.3)  with  the  relative  gap,  min  |X,-X,  |/ |X,X,  |"-.  Thus,  as  long  as  X, 

is  relatively  well  separated  from  its  neighbors,  its  corresponding  eigenvector  is  determined  to 
high  relative  accuracy.  This  is  a  much  stronger  result  than  (1.3),  as  the  following  example 
shows.  Suppose  the  eigenvalues  are  1,  2-10~'°  and  10~'°.  Then  the  gap  for  the  smallest 
eigenvalue  is  ^ap  (10~'°)=  10~'°,  but  the  relative  gap  is  .707.  Thus  (1.3)  predicts  a  loss  of  10 
decimal  digits,  whereas  the  finer  analysis  predicts  nearly  full  accuracy. 

We  also  show  that  a  suitable  variation  of  inverse  iteration  can  be  used  to  compute  the 
eigenvectors  to  this  accuracy.  We  conjecture  that  other  methods  based  on  divide  and  conquer 
can  attain  this  accuracy  as  well,  but  this  has  not  been  proven. 

Similar  results  can  be  proven  for  singular  vectors  of  bidiagonal  matrices  and  eigenvec- 
tors of  s.d.d.  definite  pencils;  the  result  for  singular  vectors  partially  settles  an  open  question 
in  [9]. 

The  rest  of  this  paper  is  organized  as  follows.  Section  2  contains  definitions.  Section  3 
discusses  some  simple  generalizations  of  Gershgorin's  theorem  applicable  to  s.d.d.  matrices. 
Section  4  uses  the  minimax  characterization  of  eigenvalues  to  present  simple  perturbation 
lemmas.  In  section  5  this  lemma  is  applied  to  singular  values  and  in  section  6  to  eigenvalues. 
Section  7  discusses  perturbation  theory  for  eigenvectors  and  singular  vectors.  Section  8  shows 
that  the  condition  numbers  for  the  eigenvectors  provide  good  estimates  for  the  reciprocal  of 
the  distance  to  the  nearest  matrix  with  multiple  eigenvalues.  Section  9  discusses  algorithms 
for  the  bidiagonal  singular  value  decomposition,  section  10  discusses  algorithms  for  the  sym- 
metric tridiagonal  eigenproblem,  and  section  11  discusses  algorithms  for  the  dense  symmetric 
eigenproblem  (both  matrices  and  pencils).  The  new  algorithm  for  the  symmetric  positive 
definite  tridiagonal  eigenproblem  will  be  included  in  the  LAPACK  linear  algebra  library  [8]. 
Section  12  applies  our  results  to  a  matrix  arising  from  a  differential  operator.    Section  13 


summarizes  the  available  algorithms  and  the  current  state  of  research,  and  discusses  future 
work. 


2.  Definitions  and  Basic  Lemmas 

In  this  paper  we  will  deal  exclusively  with  real  (usually  symmetric)  matrices.  Extensions 
to  complex  (usually  Hermitian)  matrices  will  be  obvious.  |1- 1|  will  denote  the  2-norm. 

Decompose  the  matrix  A  as  A  =D  +A'  where  D  is  diagonal  and  A^  has  a  zero  diagonal. 
We  will  call  a  matrix  A  -^-diagonally  dominant  with  respect  to  a  norm  \\\-\\\  if 
\\\N  III  £  -y  min  |D„  |,  where  Q^i<\.  Suppose  that  A  is  7-diagonally  dominant  with  respect  to 

either  the  1-norm  or  infinity-norm.  Then  the  well  known  Gershgorin's  Theorem  says  that  the 
eigenvalues  of  A  lie  in  the  union  of  the  Gershgorin  disks  B,,  where  fl,  is  centered  at  D„  and 
has  radius  at  most  7  |D„|.  In  particular,  if  some  S,  is  disjoint  from  the  other  disks,  it  contains 
exactly  one  eigenvalue  and  D„  is  an  approximation  to  this  eigenvalue  with  relative  error  at 
most  -y. 

Now  let  A  =D  +A^  and  |D„  |  =  1  i.e.  A  has  ±  I's  on  the  diagonal.  Let  A,  and  A2  be  arbi- 
trary nonsingular  diagonal  matrices.  Then  we  call  H  =  A1AA2  y-scaled  diagonally  dominant 
(•i-s.d.d.)  with  respect  to  a  norm  |||-|||  if  A  is  7-diagonally  dominant  with  respect  to  ||H||.  If 
H  is  symmetric,  we  insist  that  A  be  symmetric  as  well  in  which  case  Ai  =  A2  can  be  chosen  in 


only  one  way:  A,„=  |//„ 


11/2 


Note  that  a  matrix  may  only  be  diagonally  dominant  in  the 


scaled  sense,  as  the  following  example  shows: 


A  = 


1 


Ai  =  A->  = 


1      0 
0    100 


H  =  A,AA-,  = 


1        10 
10    10000 


Here,  A  is  -y-diagonally  dominant  with  7=.l  (with  respect  to  the  1-norm,  2-norm  or  infinity- 
norm),  H  is  -y-s.d.d.  with  the  same  -y,  but  H  is  not  diagonally  dominant  in  the  nonscaled 
sense  for  any  "y<l. 

Our  definition  of  scaling  implies  nothing  about  the  monotonicity  of  the  diagonal  entries 
of  H\  for  example  H  is  -y-s.d.d.  if 


A   = 


1 

.1 

.1 

.1 

-1 

.1 

.1 

.1 

1 

A,  =  A-,  = 


1 

0 

0 

0 

100 

0 

0 

0 

.01 

H  =  AiAA-,  = 


1  10  .001 

10     -10000       .1 
.001         .1         .0001 


If  H  is  symmetric  with  positive  diagonal  entries,  being  -y-s.d.d.  with  respect  to  the  2- 
norm  is  closely  related  to  another  well  known  property:  consistent  ordering  [18].  Consistent 
ordering  is  defined  as  follows:  Let  A  =1  +N  =1  +  L^U,  where  L  is  strictly  lower  triangular 
and  U  is  strictly  upper  triangular.  Then  A  is  consistently  ordered  if  the  eigenvalues  of 
aL-*-a~^f/  are  independent  of  a^^O.  Now  suppose  there  is  a  permutation  matrix  P  such  that  A 
in  P^HP  =  L^Ai^  =  i^{I  ~N)i^  is  consistently  ordered.  Then  we  claim  H  is  positive  definite  if 
and  only  if  it  is  s.d.d.  To  prove  this,  note  that  by  choosing  a  =  l  and  a=-l,  we  see  the 
eigenvalues  oi  N  =  L  ^  U  occur  in  ±  pairs,  including  =  ||A^  ||  =  - "y.  Now  note  that  H  is  posi- 
tive definite  if  and  only  if  I^N  is  positive  definite  (by  Sylvester's  theorem),  and  that  the 
smallest  eigenvalue  ol  I  ^ N  is  1-  UN  ||  =  1-  7.  Therefore,  the  theorems  in  this  paper  apply  to 
many  matrices  arising  from  discretized  differential  equations  [18];  see  section  12  for  an  exam- 
pie. 

We  will  call  a  symmetric  pencil  H -\M  y-scaled  diagonal  dominant  definite  (y-s.d.d. 
definite)  with  respect  to  a  norm  |||-  |||  if  H  and  M  are  7-s.d.d.  symmetric  with  respect  to  |||-  ||| 
and  M  is  positive  definite.  If  H  is  positive  definite  as  well,  we  call  H  —\M  y-s.d.d.  positive 
definite. 


If  T  is  any  symmetric  matrix,    ||r||  =  max  |X.,(r)|  ^   |||7"|||  for  any  operator  norm  or 

the  Frobenius  norm  1||-|||-  Therefore,  all  the  theorems  in  this  paper  which  are  proven  for 
diagonal  dominance  with  respect  to  the  2-norm  automatically  hold  for  diagonal  dominance 
with  respect  to  any  operator  norm  or  the  Frobenius  norm. 

The  minimax  characterization  of  the  eigenvalues   X;^  •  •  •  ^X„     of  a  definite  pencil 
H-\M  {H  =H^,  M  =M^,  M  positive  definite)  is  [17] 

x^Hx 
X.,  =  mm  max   — :;: .  i")  i\ 

lkil=i 
where  S'  varies  over  all  I'-dimensional  subspaces  of  R"  and  x  varies  over  all  unit  vectors  in  S' 
{x  could  vary  over  all  nonzero  vectors,  but  we  will  find  it  convenient  to  restrict  to  unit  vec- 
tors).   There  is  an  obvious  simplification  if  M  is  the  identity  matrix  (the  standard  eigenprob- 
lem). 


3.  Generalizations  of  Gershgorin's  Theorem 

It  turns  out  the  eigenvalues  of  the  s.d.d.  matrix  ^  =  AiAA2   He  in  Gershgorin  circles 
whose  centers  and  radii  are  both  scaled  by  AjA;: 

Proposition  1:  Let  H  =  A,AA2  fA„=±l)  be  a  (possibly  nonsymmetric)  "i-s.d.d.  matrix  with 
respect  to  the  infinity-norm  or  1-norm,  with  7<1.  Then  the  eigenvalues  of  H  lie  in  disks  B,, 
where  B,  is  centered  at  H,,  and  has  radius  at  most  y  [H,,  |.  If  B  ,  is  disjoint  from  the  other  disks,  it 
contains  exactly  one  eigenvalue  and  H„  is  an  approximation  to  that  eigenvalue  with  relative  error 
at  most  7  . 

Proof:  Suppose  without  loss  of  generality  that  H  is  -y-s.d.d.  with  respect  to  the  infinity-norm; 
otherwise  consider  H^ .  The  scalar  X  is  an  eigenvalue  of  H  if  and  only  \i  H  -\I  is  singular, 
which  is  in  turn  true  if  and  only  if  A-XAf'A^'  is  singular.  Let  x  be  a  right  null  vector  of 
j4  — XAf'A^',  and  suppose  Xj  has  absolute  value  at  least  as  large  as  any  other  component  of 
X.    Then  we  may  rearrange  the  equation 

-1    A-l      ^ 


2  Aj^x^  -  Xx^AijjAijj  =  0 

k 


to  obtain 


Xt 

^  =  ^\.ij^2.jj  i^ij  +   2  Ajk  — )    . 

Since  Ajjii.ijjC^i.jj  =  f^jj  ^nd  |  2  A^*  — |  s  7,  X  must  lie  in  a  ball  of  radius  7  |//„|  centered  at 

H„.  The  usual  Gershgorin  argument  shows  that  if  this  disk  is  isolated,  it  contains  exactly  one 
eigenvalue.  □ 

This  theorem  implies  that  at  least  if  a  Gershgorin  disk  is  isolated  so  that  small  relative 
changes  in  the  matrix  entries  do  not  effect  its  isolation,  then  the  eigenvalue  it  contains  cannot 
change  by  a  factor  of  more  than  (l-)-7)/(l  — 7).  If  we  assume  H  is  symmetric,  we  need  not 
assume  the  disks  are  isolated  to  obtain  this  result: 

Proposition  2:  Let  H  be  a  -y-s.d.d.  symmetric  matrix  with  respect  to  the  2-norm.  Let  h,  be  its 
diagonal  entries  in  increasing  order  h^^  ■  ■  ■  ^h„,  and  X/  its  eigenvalues,  also  in  increasing 
order.  Then 

>^, 
1-7  S   —  <.  1-1-7    . 


Proof:  Assume  without  loss  of  generality  that  H„  =  h,,  by  reordering  the  rows  and  columns  of 
H  if  necessary.  Then  by  (2.1) 

X,  =  min   max  x^Hx  s    max  x^Hx  =    max  x  H^'^x 


x«S' 

llxll  =  l 


X  =1 


1x11  =  1 


where  Sq  is  the  space  spanned  by  the  first  i  standard  basis  vectors,  and  //'''  is  the  leading 
principal  ;  by  i  submatrix  of  H.  If  h,<0  then  X,^(l--y)/i,  by  simple  norm  inequalities.  If 
/i,>0  and  all  ;i,>0  for  j^i,  simple  norm  inequalities  again  imply  X,:£(l  +  -/)/i,.  If  /i,>0  and 
some  hj<0  for  j<i,  we  also  have  X,<(l  +  -/)/i,  but  we  must  argue  as  follows: 


X  H 


(o; 


=    [Xl     ,   X;] 


Ui 

0  1 

1' 

-/,    0 

\ 

\^\ 

n  ] 

X\ 

+  N^'^ 

u 

A. 

• 

0     Ik 

' 

0 

A; 

Xi 

-|lA,i: 


P  +    ||A,^|I-  +  7(l|A,i,l|'  -    l|A,x,|r-) 


s  (1  +  7)I|A:X2|P  s  (l+-y)/',    • 

Applying  the  same  process  to  - H  yields  the  complementary  inequalities  X,s(l-7)/i,  for 
/i,<0  and  X,a(l--/)/i,  for /i,>0.    D 

Finally,  we  may  extend  the  result  to  s.d.d.  symmetric  definite  pencils: 

Proposition  3:  Let  //  — XA/  be  a  symmetric  y-s.d.d.  definite  pencil  with  respect  to  the  2-norm. 
Let  r,  be  the  sorted  ratios  of  diagonal  entries  H^JM^^,  where  r^^  •  •  ■  <r„,  and  X,  the  eigen- 
values, also  in  increasing  order.  Then 

1  —  7  ^[  l-f-v 

l-t-7  r,  1-7 


Proof:  Assume  as  in  the  proof  of  Proposition  2  that  r,  =  H„/M,i,  by  reordering  rows  and 
columns  if  necessary.  Write  Af  =  AAA  where  A  is  diagonal  and  A  is  diagonally  dominant  with 
ones  on  its  diagonal.  Then  the  pencil  A~'/fA~'  — XA  =  R  —  \A  has  the  same  eigenvalues  as 
H  —  \M,  but  now  the  diagonal  entries  R„  =  r,.  Then 


X,  =  min  max 
s'       x«S' 

l|xit  =  l 


x'^'Ax 


max 

x€Sb 

11x11=1 


x'^Rx 


max 


fp^'^: 


=  1    X 


A^'^i 


where  So  is  the  space  spanned  by  the  first  ;  standard  basis  vectors,  and  R^'^  and  A''^  are  the 
leading  principal  1  by  1  submatrices  of  R  and  A,  respectively.  Note  that 
1--/  s  X  A^'^x  s  l-i--y  for  all  unit  vectors  x,  since  A  equals  the  identity  matrix  plus  a  matrix 
of  norm  at  most  -y.  Then  by  Proposition  2,  we  have  X,  s  {l  +  -^)l{l-'i)r,  if  r,>0  and 
X,  <  (l-7)/(1^7)r,  if  r,<0.  Applying  the  same  process  to  7?  *  XA  yields  the  other  inequali- 
ties. D 


4.  Perturbation  Lemmas  Based  on  the  Minimax  Theorem 

Let  X)(H,Af)s  •  •  •  sX„(//,M)  denote  the  eigenvalues  of  the  definite  pencil  H-\M. 
Given  the  minimax  characterization  in  (2.1),  the  following  lemma  is  simple  to  prove: 

Lemma  1:  Suppose  hH  has  the  property  that  for  all  nonzero  x 

x'^(H  +  ?>H)x   ^ 
"^  x^Hx  "^"    ' 


where  Q<gi^gu-  Then 


\iiH  +  bH,M)    ^ 
^'  ~         X,(W,Af)         ~  '^" 


and  similarly 


for  all  i.  In  other  words,  if  the  Rayleigh  quotients  x^(H  +  BH )x  and  x^Hx  differ  by  at  most  a  cer- 
tain factor  for  all  x,  then  the  eigenvalues  of  H  +  hH  -  \M  and  H  -  XM  differ  by  at  most  that  same 
factor. 

Proof:  Let  X,  =  X,(ff,M)  and  Let  X',  =  \,iH  +  bH,M)-  We  consider  only  X,>0;  the  case 
X,<0  is  analogous.    Let  the  spaces  Sq  and  S[  satisfy 

X,=  max  x^Wa:/x^Mx     and     k' ,=  max  x^(H+bH)x/x^Mx    . 

Then 

x^(H  +  hH)x  x^(H  +  ?,H)x    x^Hx    ^       , 

X',  =  min  max       ^"  '     ^  max  — ^-^r-; '-  -jrr  -  Su>-, 

s'     xis'  x^Mx  xish  X  fix  x'Mx 

x^Hx    ^  x^Hx  x^{H  +  hH)x   ^        ,    , 

X,  =  mm  max  —z ^  max  ^r-- —  j  ^  gi    A.  , 

s'     xiS'    x^Mx         xis\   x^{H  +  hH)x  x'Mx 

completing  the  proof.  D 

Lemma  1  can  also  be  generalized  to  infinite  dimensional  operators  [15,  Thm  VL3.9]. 

There  is  an  obvious  analogous  result  if  if  both  H  and  M  are  perturbed  simultaneously: 

Lemma  2:  Suppose  hH  and  bM  have  the  property  that  for  all  nonzero  x 

x^(H  +  bH)x  ,  ,    x'^(M^hM)x   _ 

giH  ^  ^TT; ^  =^  8uH     and     gi^  s  — y- s  g^,,,    , 

x'Hx  x'Mx 

where  0<giH'^guH  'J"^  ^<glM^guM-  Then 

g,H    ^    X,(H  +  hH,M  +  hM)    ^    guH_ 

8uM  \,{H,M)  glM 

for  all  i. 

Lemma  3:  Let  H  be  symmetric  y-s.d.d.  with  respect  to  the  2-norm.  Write  H  =  AAA  where  A  is 
diagonal  and  A  has  ~l's  on  its  diagonal.  Let  X  be  an  eigenvalue  of  H,  y  the  corresponding  unit 
eigenvector,  and  x=  Ay  I  \\i^y\\  the  corresponding  unit  eigenvector  of  A  —\A.~' .  Then 

1-7  <    \x'''Ax\  <  1+7    . 

Proof:  Write  A  =E  +N,  where  E  is  diagonal  with  ±  I's  on  the  diagonal,  N  has  zero  diagonal, 
and    ||A'||£7<1.    The    upper    bound    on    |x^Ax  |    follows    immediately    from    taking    norms: 

|xMx  I  <  ||£||+  llA'll  <  l-t-7.  If  £=/,  then  |x^Ax  |  =  |l  +  x^A^x  |  s  I-7.  Assume  then 
without  loss  of  generality  that 


E  = 


h      0 


0    -h\ 

where  //  denotes  an  /-by-/  identity  matrix.  We  will  prove  the  theorem  only  for  X.<0;  for  the 
positive  X.  consider  -H.    Partition 


^1 

,     A  = 

A, 

0 

and 

A^  = 

'n\ 

x^ 

0 

A: 

.^2 

X    = 


conformally  with  E.  Then  Ax=  XA~*-x  may  be  rewritten 

Xi  +  N\x  =  XAf-Xj 

—  X2    +   N^X    =    kA2'X2 

Solving  the  first  equation  above  for  jt]  yields 

X,  =   (XAf-  -  iy^N\x    . 
Note  that  XAf*  -  /  is  diagonal  with  diagonal  entries  less  than  -1.    Now 

x^Ax  =  x'^Ex  +  x^Nx  =  j:[x,  -  xlx2  +  x'''Nx  =  2x[xi  -  1  +  x'^Nx 
since  xfxj  -I-  ^2X2  =  1-    Combining  the  last  two  displayed  equations,  and  using  the  fact  that 

-x^NiikAr  -  iy^N\x  2:  x'^NiiXAi^  -  /)"'A'[x  >  0 
yields 


x^Ax  =  2x^Afi(X.Af-  -  iy-N\x  +  (x[  ,  x[) 


A^jx 


-  1 


2x'"Ari(XAr-  -  ly-N^x  +  xInIx  +  x'^N^{\A^~  -  /)~'A^[x  -   1 


=  i^T 


xXnIx  +  x'^NiiXAi-  -  ly-Njx  -  1 


=  x^A^Tx  +  x[(\Af-  -  /)~'A'[x  -   1 


=   (x[(XAr-  -/)"',  x[)A'x  -  1 


(XAf-  -  /)-'x,' 

X2 

x-> 

\\-\\Nx\\  -   1 

A^x      -   1 


y  -  1 


In  the  next  two  sections  we  will  use  these  results  to  derive  perturbation  theorems  for 
eigenvalues  and  singular  values. 


5.  A  Perturbation  Theorem  for  Singular  Values. 

Using  Lemma  2,  we  prove  the  following  theorem,  which  is  a  slight  strengthening  of  a 
result  of  Kahan  [9]: 

Theorem  1:  Let  B  be  an  n  by  n  bidiagonal  matrix: 


B  = 


bn-l 


We  assume  the  fl,  and  b,  are  nonzero  since  otherwise  B  splits  into  independent  subproblems.  Let 
B  +  6S  be  a  perturbed  bidiagonal  matrix  with  entries  a,a,  in  place  of  a,  and  ^,b,  in  place  of  b,. 
Then  the  singular  values  cti^  ■  ■  ■  ^(j„  of  B  and  ct'i^  ■  ■  •  So-'„  satisfy 

gi^ ^  gu 


where  gi  and  gu  are  defined  as  follows.  Define  the  finite  set  S  of  positive  numbers  by 


5  =  {|' 


P; 


a-i  +  l  •  ■  ■  ttfc 


:  l^j^k^n-1}    U    {| 


ak 


3;    •    ■    •    Pit- 


:  l<;:S;t<n} 


Note    that   S    contains    |P^|,    l^j<n,    and    \a.j\,    1<_/Sn.     Let  min  5   and   max  S    denote    the 
minimum  and  maximum  entries  of  S,  respectively.  Then 


gu  =  max  S     and     gi  =  min  S    . 

Corollary   1:  Let  B  and  B+bB   be  bidiagonal  with  singular  values  CT](S)s  • 
o-i(S^5fi)S   •  ■  ■  :Scr,(fl  +  6fl)         respectively.  If       for        all        nonzero 

t"'s  |(fl +  55),/S,_,  I  £  T  for  some  7^1,  then 


■  Scr,(S)   and 
entries        B,i, 


1 


,:n-i 


CT,(S  +5S) 
«T,(fl) 


ln-\ 


Thus,  relative  perturbations  of  at  most  t  in  the  entries  of  B  cause  relative  perturbations  of  at 


most  T 


2n-l 


in  its  singular  values.    If  j=1  +  t\  is  close  to  1 ,  so  is  t 


ln-\. 


■\  +  {2n-\)t\. 


Corollary  2:  Let  B  and  B +hB  be  bidiagonal.  If  |(S  +  5fl),;|  =  t|B,^|  for  all  i  and  j,  then 
<J ,{B  +5fl)  =  Tcr|(S).  This  simply  says  that  if  you  multiply  each  entry  of  a  bidiagonal  matrix  by 
±T,  you  multiply  all  its  singular  values  by  t  as  well. 

Proof  of  Theorem  1:  For  notational  convenience  rename  the  entries  of  B  so  that 


B  = 


Si     S2 

^3     ^4 


In -I 


and  so  that  B  ^  BB  has  entries  -y,5,  in  place  of  s,.  We  may  assume  without  loss  of  generality 
that  ail  the  s,  are  real  and  positive,  since  this  may  be  achieved  by  pre-  and  postmultiplying  B 
by  unitary  diagonal  matrices.  We  may  also  assume  the  7,  are  real  and  positive  for  the  same 
reason.    We  also  use  the  well  known  fact  that  the  eigenvalues  of 


C  = 


0    B' 
B     0 


are   plus   and   minus   the   singular  values   of  B.    Furthermore,   by    reordering   the   rows   and 


columns  of  C  in  the  order  I,  n  ^l,  2,  n +2,  .  .  .  ,  n,  2n,  we  see  C  is  similar  to 

0    s, 
J,      • 


E  = 


Thus,  the  singular  values  of  B  are  the  absolute  values  of  the  eigenvalues  of  the  pencil 
E-kl.  We  are  interested  in  comparing  these  eigenvalues  with  the  eigenvalues  of  £-i-5£-X/, 
where  E  ~&E  has  entries  -y.i,  in  place  of  s,.  We  will  chose  a  diagonal  matrix  D  such  that 
D  {E  -fbE)D  =  E,  so  that  these  eigenvalues  will  be  the  eigenvalues  of  the  pencil  E-\D~;  we 
will  then  apply  Lemma  2  to  conclude  that  the  eigenvalues  will  change  at  most  by  a  factor  in 
the  range  miiLD,7"  to  maxDjJ". 

Actually,  we  will  find  two  diagonal  matrices  D^  and  D.  satisfying  D  (E  +bE)D  =E  but 
corresponding    to    different   choices   of   D  ^    in   order   to    minimize    max  D;„    and   maximize 

min  Duu'i  this  will  in  turn  maximize  gi  and  minimize  gu  where  (by  Lemma  2) 


gi  - 


max  D; 


1 


min  Du„ 


=    gu 


First  consider  D;.  We  want  to  choose  D/u  to  minimize  max  D/„.  As  above,  the  diagonal 

entries  of  D,  must  satisfy  the  following  recurrence:  D;,.,  ,+  i  =  (7,0;,,)"'.  This  means  the 
diagonal  entries  of  D;  are  either  of  the  form 

i\ii  ■  ■  ■  yzi-\ 


or 


Diu- 


1 


-izlA 


lij 


■Y:74  •  •  •  7:/ 


Dn\y\    7375  •  •  •  7:;-i 


(5.1) 


(5.2) 


Choosing  D/u  to  minimize  the  maximum  of  these  terms  is  equivalent  to  choosing  D/u  to 
mmimize  max(jiD;ii  ,  s^D^l),  where  s^  is  the  maximum  coefficient  of  D/u  in  (5.1)  and  ^2  is 
the  maximum  coefficient  of  Dfyl  in  (5.2);  this  minimum  is  easily  seen  to  be  (s\S^y^.  Some 
tedious  algebraic  manipulation  leads  to  the  expression  for  gi  in  the  statement  of  the  theorem. 
Choosing  D^w  to  maximize  min  Dm,  is  analogous.  D 

A  similar  proof  yields  the  following  analogous  result:  Let  A  be  an  arbitrary  matrix,  and 
D]  and  D2  arbitrary  nonsingular  diagonal  matrices,  which  we  may  assume  are  real  and  non- 
negative.  Then  the  singular  values  tr,  of  A  and  cr',  of  D  i/lD;  satisfy 

a', 
mm  D  1,-mm  D-.,  s  <  max  D  i,-max  D^j    . 
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6.  Perturbation  Theorems  for  Eigenvalues 

In  this  section  we  use  the  perturbation  lemma  of  section  3  to  prove  perturbation 
theorems  for  eigenvalues  of  symmetric  -/-s.d.d.  matrices  and  7-s.d.d.  positive  definite  pen- 
cils. Theorems  2  and  3  apply  only  to  positive  definite  matrices  and  pencils,  and  are  stronger 
than  Proposition  4  and  Theorem  4  which  apply  to  indefinite  matrices  as  well.  In  fact,  when 
the  matrix  or  pencil  is  positive  definite,  A  may  be  an  arbitrary  matrix,  not  just  diagonal.  Our 
results  for  definite  pencils  with  both  positive  and  negative  eigenvalues  are  rather  weaker  than 
the  results  for  s.d.d.  indefinite  symmetric  matrices. 

Theorem  2:  Let  ff  =  AAA  =  A(/ +A^)A  be  a  positive  definite  matrix  where  ||A^||  =  7<1,  and  A  is 
an  arbitrary  nonsingular  matrix.  (If  &i.  is  diagonal  this  is  equivalent  to  H  being  a  symmetric  posi- 
tive definite  y-s.d.d.  matrix  with  respect  to  the  2-norm.)  Let  bH  be  a  symmetric  perturbation  of 
H  with  ||A~'5//A~' II  =  -n  <  1-7.  Then  the  eigenvalues  XjS  •  •  •  ^X„  of  H  and 
X'i<   •  •  ■  ^\' „  ofH+hH  satisfy 

1-   ^3-^  A!i<  1  + 


1-7  X,  1-1 

Thus,  if  A  is  diagonal,  relative  perturbations  of  at  most  r\  in  the  entries  of  H  cause  only  relative 
perturbations  of  at  most  nr\/{l—  y)  in  the  eigenvalues. 

Proof:    We  have 

x^(H-rbH)x   ^    x^AM  ^A~'S//A~')Ax    ^    y^(A  ^  A"'S//A  "')y 


X 


Hx  x^AAAx  y^Ay 


(6.1) 


where  y  =  Ax.  The  values  taken  by  the  quantity  in  (6.1)  as  x  varies  over  all  nonzero  vectors 
are  the  same  as  the  values  taken  on  as  y  varies  over  all  unit  vectors.  Since  l--y£y^A>'  if  y  is  a 
unit  vector. 


\--JL 


y^(A  +A~'SHA"')y 


1 •^—  ^  ^^^ — --  ""- ^sl+^J—    . 

1-7  y' Ay  1-7 

The  result  now  follows  from  Lemma  1.  D 

Theorem  3:  Let  H  =  Af]AH^H=  Ah{1  +  Nh)Ah  and  M  =  Af,iA^fAM=  ^mU  +Nm)Am  be  positive 
definite  matrices  where  \\Nfi\\^y<l.  IJA^^t/ i|:S7<  1,  and  Aff  and  A^  are  arbitrary  nonsingular 
matrices.  (If  A  a  and  A^  are  diagonal  this  is  equivalent  to  H  —  \M  being  a  y-s.d.d.  positive 
definite  pencil  with  respect  to  the  2-norm.)  Let  bH  and  5Af  be  symmetric  perturbations  of  H  and 
M  such  that  ||A^'5//A^'||  ^  ti  <  I-7  and  ||A,w'5A/A.r/' ||  ^  ^i  <  I-7.  Then  the  eigen- 
values Xi^   •  ■  •  :S\„ofH-\Mand\'i:^  ■  ■  ■  <.\' „  of  (H  -f-hH)-\{M  +5A/)  satisfy 

l-y-t)    ^   _^  ^    1-7  +  ^1 

1— 7^T1  X',  1— 7  — T| 

Thus,  if  Aff  and  A  v/  are  diagonal,  relative  perturbations  of  at  most  -r)  in  the  entries  of  H  and  M 
cause  only  relative  perturbations  of  at  most  m\/(l-y)  in  the  eigenvalues  of  H  -  \M . 

Proof:  Apply  the  technique  in  the  proof  of  Theorem  2  to  both  H  and  M  and  apply  Lemma  2. 

D 
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When  H  is  indefinite,  our  results  are  weaker.  In  the  case  of  H  alone,  we  must  assume  A 
is  diagonal  to  attain  a  bound  like  that  of  Theorem  2: 

Proposition  4:  Let  H  be  an  n-by-n  symmetric  y-s.d.d.  matrix  with  respect  to  the  2-norm.  Thus 
H  =  AAA  where  A  is  diagonal,  A  =  E  +N  where  E  is  diagonal  with  ±  I's  on  the  diagonal,  N  has 
a  zero  diagonal  and  ||A^||^7<1.  Let  5//  be  a  symmetric  perturbation  of  H  with 
||A~'5WA~'||  =  T)  <  (1  — 7)/n.  Assume  also  that  H+bH  is  -i-s.d.d.  Then  the  eigenvalues 
\i<   ■  :  ■  <X„  ofH  and  X'i<   •  •  •  <X.'„  ofH  +  hH  satisfy 

1-7  X,  1-7 

Thus,  relative  perturbations  of  at  most  r\  in  the  entries  of  H  can  cause  relative  perturbations  of  at 
most  n"T|/(l  — 7)  in  the  eigenvalues. 

Proof:  We  will  prove  the  theorem  only  for  the  negative  eigenvalues;  for  the  positive  ones 
consider  —H.  We  cannot  apply  Lemma  1  directly  here  because  x^Ax/x^x  is  not  bounded 
away  from  0  for  all  x  as  in  the  proof  of  Proposition  4.  By  Lemma  3,  however,  it  is  so 
bounded  if  we  restrict  x  to  lie  in  an  eigenspace  of  A  — XA~'  corresponding  to  only  negative 
(or  only  positive)  eigenvalues;  this  will  be  sufficient  for  the  proof. 

To  this  end,  let  XjS  •  •  •  £X„  be  the  eigenvalues  and  xj,  .  .  .  ,x„  the  corresponding  unit 
right  eigenvectors  of  A  -XA~";  let  the  primed  quantities  X'l^  •  •  •  ^X'„  and  x',  correspond 
to  A  -rA"'5/fA"'-XA~-.    By  Lemma  3  xrAx,:£7-l<0  if  X,<0. 

Now  let  X,  =  [xi,  .  .  .  ,x,]  and  A,  =  diagCK^,  .  .  .  ,X,),  so  AX,  =  A~-X,A,.  Since  the 
columns  of  X,  are  eigenvectors  of  A— XA"-,  the  columns  of  A~'X,  are  eigenvectors  of  H  and 
so  are  orthogonal.  Thus 

XfAX,  =  XrA"-X,A,  =  diagixiA'-x.X,) 

is  diagonal  with  diagonal  entries  bounded  above  by  7— 1.  Let  z  be  an  arbitrary  unit  vector; 
then 

z^xJaX.z  =  2'^diag{xfA''x,\,)z  s  7-I    . 

Now  we  use  the  characterization 

x^Ax 


X,  =  min  max  —r 

s'     xis'    x^A    -X 

where  the  minimum  is  attained  for  S'  =  Sq  =  span(X,).  Then 

x^(A  4-A~'S//A~'')x  x^fA+A-'S/ZA-'U       x^Ax 


X',  =  min  max  — ^ :; ; '—  s  max 

S'       liS 


x^A    -X  zisb  x^ Ax  x^A' 


;^XrA~'S//A~'X,z        z^X\aX,z 

=    max   (1  +  p-y )'~r~f --, • 

..-1=1  z^x]AXa  z^XiA-'X,z 

Now   |r^xrA-'5//A-'X,2|  <  i-p  and  |z^xrAX,;|  >  I-7  so 

^',^(1-T^)X,       or       A^^i_J3_ 
1-7  X,  1-7 

Swapping  the  roles  of  A  and  A  -r  A  " ' S//A  " '  we  obtain 

1  - -i^  ^  4^  ^  (1  - -i^)-i 

1-7        X,  1-7 

as  desired.  D 


12 


The  factor  n  in  the  bound  of  Proposition  4  is  an  overestimate,  and  can  be  removed  by 
modifying  the  conditions  of  the  proposition  just  slightly: 

Theorem  4:  Let  H  =  AAA  be  an  n-by-n  symmetric  y-s.d.d.  matrix  with  respect  to  the  2-norm. 
Here  A  is  diagonal  and  A  has  ±l's  on  the  diagonal.  Let  bH  be  a  symmetric  perturbation  with 
||A~'8//A"' II  =  Ti-  Assume  that  H  +  ihH  is  y-s.d.d.  for  all  Os^<l.  Then  letting 
Xj^  •  •  ■  :SX.„  be  the  eigenvalues  of  H  and  k' ^^  ■  ■  ■  ^\' „  be  the  eigenvalues  ofH  +  hH,  we 


have 


exp(Y3^)  ^  ^  ^  ^^P(l^)    •  (6-2) 


Proof:  Let  E  =  A" '5HA~V  ||  A" '5//A~ '  ||  be  a  matrix  of  norm  1,  H  {Q  =  A(A+C£;)A,  and 
X.j(5)s  •  ■  ■  ■s.Xi^l)  be  the  eigenvalues  of  H{1).  Suppose  first  that  X,(0)  is  simple.  Let  x,  be 
the  unit  eigenvector  corresponding  to  X,(0).  Then  from  standard  eigenvalue  perturbation 
theorem  [15,  19],  we  know 

X.CO  =  >^,(0)  +  CxrAfAx,  +  0{i^) 

Therefore 

X,(0)  xfA/lAx,  yjAy, 

where  Hy,  ||  may  be  taken  to  be  one.  By  Lemma  3,  lyT^y,  1  S:  I-7,  and  so 

Assume  now  that  X,(0  is  simple  for  all  O^^s-ri;  then  (6.3)  implies  that 
\d  logX,(j:)/Jx|  :s  (l--y)~'.  Integrating  from  0  to  ti  yields  (6.2).  By  [11,  Theorem  n.6.1], 
the  eigenvalues  are  all  real  analytic,  even  when  they  are  multiple.  Thus,  if  there  are  only  fin- 
itely many  I,  where  X,($)  is  multiple,  we  can  apply  the  above  argument  in  the  intermediate 
intervals. 

It  remains  to  consider  the  case  where  X,(^)  is  multiple  for  infinitely  many  ?.  Here  we 
argue  that  this  can  only  happen  for  a  set  of  pairs  of  matrices  H  and  E  of  measure  zero,  that 
off  this  set  the  previous  argument  holds,  and  by  continuity  of  the  eigenvalues  the  same 
bounds  must  hold  on  the  set.  To  see  that  the  set  of  H  and  E  such  that  some  X,(5)  is  multiple 
infinitely  often  is  of  measure  zero,  consider  the  discriminant  of  the  characteristic  polynomial 
of  //-^A£A;  this  is  a  polynomial  in  t,  and  the  In'  entries  of  H  and  E.  H  {Q  can  have  multi- 
ple eigenvalues  if  and  only  if  this  discriminant  vanishes.  If  it  vanishes  for  infinitely  many  ^, 
its  coefficients  (viewing  it  as  a  polynomial  in  Q  must  vanish  identically.  These  coefficient  are 
in  turn  polynomials  in  the  entries  of  H  and  E,  and  so  vanish  only  on  a  proper  variety  (a  set  of 
measure  zero).  Off  this  set,  the  discriminant  has  at  most  a  finite  number  of  zeros  (bounded 
by  its  degree  as  a  polynomial  in  Q.    □ 

Result  (6.2)  was  claimed  without  proof  in  [14]  just  for  the  case  of  tridiagonal  matrices 
perturbed  on  their  offdiagonals. 

In  light  of  Proposition  3,  it  is  probably  possible  to  prove  an  analogous  theorem  for  -y- 
s.d.d.  definite  pencils  which  have  both  positive  and  negative  eigenvalues,  but  we  have  not 
been  able  to  do  so.  However,  the  following  technique  frequently  succeeds  in  reducing  the 
eigenproblem  for  such  pencils  to  a  problem  where  Theorem  4  may  be  applied: 
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Algorithm  1:  Reducing  a  y-s.d.d.  definite  pencil  H  —  kM    to  an  s.d.d.  matrix  Y: 
(!)     Let  Di  =  diag(A/,'/'),  and  compute  Hi  =  D~^HD''^  and  M,  =  D  ~'AfD  "'.  Now  Mj  has  unit 
diagonal  and  is  diagonally  dominant  in  the  usual  sense. 

(2)  Let  P  be  a  permutation  matrix  chosen  so  that  H2  =  PH \P^  has  its  diagonally  entries 
sorted  from  smallest  to  largest  in  absolute  value  (smallest  at  the  top  left,  largest  at  the 
bottom  right).  Let  M;  =  FM,/''". 

(3)  Let  L  be  the  lower  triangular  Cholesky  factor  of  Mt-  Let  Y  =  L~^H2L~'^.  Then  Y  and 
H  -  X.M  have  the  same  eigenvalues. 

This  is  a  variation  on  the  usual  reduction  of  a  definite  pencil  to  standard  form.  The 
point  is  that  if  M  is  sufficiently  diagonally  dominant,  L  will  also  be  diagonally  dominant  with 
nearly  unit  diagonal,  and  the  multiplication  L'^HiL'^  will  not  destroy  the  s.d.d.  property  of 
Hi.  Thus,  Y  will  be  s.d.d.  The  following  theorem  formalizes  this,  but  is  weak  in  that  it  only 
guarantees  diagonal  dominance  of  Y  for  rather  small  7,  much  smaller  than  those  that  work  in 
practice: 

Proposition  5:  Let  H  -\M  be  an  n  by  n  -^-s.d.d.  definite  pencil,  and  let  Y  be  the  output  of  the 
above  reduction  algorithm.  Define 

y  =  ((2n)>'^--fl)-^-[2  +  {{Iny^^lW'W^    . 
Then  if 

l-7_' 
which  will  be  true  for  -y  small  enough,  Y  will  be  y-s.d.d. 

The  proof  of  Proposition  5  requires  the  following  lemma: 

Lemma  4:  Let  M  be  an  n  by  n  y-s.d.d.  matrix  with  unit  diagonal.  Let  L  be  its  lower  triangular 
Cholesky  factor.  Then    \\L  - 1  \\  s  ((2n)"' +  1)7''^  and    ||L-'-/||  ^  {{Inf- +  1^'' lil-y)"' . 
Also,  (1+7)""^  ^    11^"' II  ^  (1-7)"'". 
Proof:  Let  X=Z,  -/  and'W=L"'-/  =   -L"'X.  Since 

(l-7)"'2  s  \]ll{M)  =  a^,,{L)  ^  L„  ^  a,3x(i)  =   ^i,'L(M)  ^   (1  +  7)"' 

we  have  \X„\  =  |Z,„-l|  ^  l-(l-7)"^  ^  7'''.  Also,  we  may  bound  the  norm  of  the  ;-th 
subdiagonal  column  of  X  as  follows: 

1  +  7  s   ||[L,,,  ,L,.,,,  ,  •  •  •  ,i.n,]||-  =    ||[L,,,  ,X,.,.,-,  •  ■  •  .X^JH'  a  I-7  +    l|X,-,;„.,  ||- 

whence  i|X,^i.„ ,  ||  ^  (27)"'.  Thus  \\X\\  ^  {InyY'^ ^■^'''^  as  desired.  Finally, 
1|TV||<   IlL-'li-liXll^  (l-7)-"'llX||.D 

Proof  of  Proposition  5:  By  applying  the  first  two  steps  of  the  reduction,  assume  without  loss 
of  generality  that  M  has  unit  diagonal  and  that  |W„|  ^  |//,_i  ,.,1  for  all  i.  Let  L  ,  denote 
the  i-th  column  of  L  and  similarly  for  Lj..  Also,  let  G*'-*^  denote  the  leading  ;  by  ;  submatrix 
ofG.    LetZ)  =  diag(|/f„  I''-)  and  L  "'  =  /'  + W.  Then 

Y,j  =  L7,^HL~J  =  H,j  +  W,^.HL~J  +  L'.^HWj  +  W.^.HW  ^    . 

Therefore 

\Y,j-H,j\  ^    \W,^.HL-J\  +    \L-'HW  j\  +    \W,^HW  J 

^  (2||W|l-i|L-'||  -    \\W\\')\\H^--^^\\  <  (2||W|M|L-'||  -    ||W||-)D„D,,(l-7)    • 
Now  insert  the  bounds  of  Lemma  4  to  get 
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\(D-\Y-H)D-'),j\  ^  ((2n)>'2  +  l)-|i^-[2  +   ((2n)"-  +  l)^"-]-7"^  =  7'    ■ 

Finally,  letting  Dy  =  diag(|r„|"^) 

||offdiag(D,-rD,-')||^    (n  +  l^^  +  T 

1-7 

follows  from  simple  norm  inequalities.  □ 

In  order  to  apply  Proposition  4  or  Theorem  4,  however,  we  must  arguethat  small  rela- 
tive perturbations  in  H  and  M  (of  the  type  permitted  in  Proposition  4  and  Theorem  4)  cause 
small  perturbations  of  the  same  type  in  Y: 

Theorem  5:  Let  H  -\M  =  D^AhD H-><Df,,A.MDM  be  a  -i-s.d.d.  definite  pencil,  where  A^  and 
Am  have  ±\s  on  their  diagonals.  Let  Y  =  DyAyDY  be  the  output  of  Algorithm  1  applied  to 
H-kM.  Assume  that  Y  is  y-s.d.d.  (y  may  be  smaller  than  the  expression  in  Proposition  5).  Now 
define  HiO  =  Dh{Ah+ l,E,i)Dfj.  and  similarly  M{1)  =  D!^(Am+ I,Em)D!^.  where 
\\Efi\\  =  \\E/^\\  =  1.  Let  y(0  be  the  output  of  the  reduction  algorithm  applied  to  H{t,).  Then 
for  asymptotically  small  4 

and  the  eigenvalues  X.,(0  of  H  ii)-\M  {(,)  satisfy 

^        a-y)il-yr-  ^^^         ^.(0)  (l-7)(l-7)^ 

For  the  proof  of  Theorem  5  we  require  the  following  lemma: 

Lemma  5:  Let  M  be  a  y-s.d.d.  symmetric  matrix  with  unit  diagonal.  Let  L  be  its  lower  triangu- 
lar Cholesky  factor.  Let  L+hL  be  the  lower  triangular  Cholesky  factor  of  the  perturbed  matrix 
M  +  5Af .  Then 

1/2 


8L 


U-^J 


|5A/||  +  0{\\bM\\')    . 


Proof:  It  suffices  to  consider  M  diagonal,  since  the  Cholesky  factors  of  M  and  QMQ^  {Q 
orthogonal)  have  the  same  norm.  Equating  first  order  terms  on  both  sides  of 
M^hM  =  (L+6L)(Z,-r5L)'"  yields  bL,j  =  M'^'^hM.j  (/>;)  and  5L„  =  .5-Af ,7'-5M„  (i=;). 
Taking  norms  yields  the  result.  □ 

Proof  of  Theorem  5:  By  applying  the  first  two  steps  of  Algorithm  1,  we  may  assume  without 
loss  of  generality  that  A/„=l  and  |W„  |  £  |//,>i,,^i|.  Let  L(0  be  the  Cholesky  factor  of 
M(Q.  Then  the  eigenvalues  of  //(C)-XAf(O  are  the  eigenvalues  of 
y(0  =  L-'^{i)H{C,)L-'^{Q.  Letting  M{0  =  M  +bM,  L(C)  =  L  +  6L,  and  Hil)  =  H  +  ?,H,  we  get 
that  to  first  order 

=  L'^HL''^  -  L'^hLL'^HL'^  +  L'^hML''^  -  L~'^HL'^hL^L~'''    . 
Therefore  to  first  order  in  C, 

\iYa)-Y).j\  <Z)^,,D,4.,;(2-(l  +  7)-|l^-'IP-n"--(l-7)-'''+||Z^-'ir')S 

as  desired.  Applying  Theorem  4  yields  the  final  result.  □ 
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We  may  apply  Theorem  4  to  analyze  the  convergence  criterion  for  the  QR  algorithm  for 
eigenvalues  of  symmetric  tridiagonal  matrices  [17].  In  the  course  of  running  the  QR  algo- 
rithm on  a  symmetric  tridiagonal  matrix  one  must  examine  the  matrix 


T  = 


aj      bj 

bj    Oj^i 


and  decide  whether  fc,  can  be  set  to  zero  ("convergence")  without  making  unacceptably  large 
perturbations  in  the  eigenvalues.  Theorem  4  tells  us  that  if  T  is  -y-s.d.d.,  then  setting  bj  to 
zero  makes  relative  errors  no  larger  than 

^^P(|       '^•'''  11/2 -TrT)  -  1  (6-4) 

in  any  eigenvalue.  This  result  is  attractive  because  it  is  inexpensive  and  purely  local;  it  only 
depends  on  b,  and  its  neighbors  Oj  and  a^^j  on  the  diagonal  of  T.  It  does  differ  from  the 
standard  criterion  which  essentially  asks  if 

is  small;  this  criterion  is  weaker  than  (6.4)  and  does  not  guarantee  high  relative  accuracy. 

Unfortunately,  even  using  (6.4)  as  a  convergence  criterion  does  not  guarantee  that  QR 
will  compute  eigenvalues  with  high  relative  accuracy;  there  are  examples  which  are  even 
fairly  strongly  s.d.d.  of  QR  computing  eigenvalues  with  incorrect  signs,  i.e.  no  relative  accu- 
racy at  all.    We  discuss  this  further  in  Section  10. 

7.  Perturbation  Theorems  for  Eigenvectors 

In  this  section  we  discuss  the  sensitivity  of  the  eigenvectors  of  symmetric  s.d.d.  matrices 
under  the  same  small  perturbations  as  in  section  6.  As  discussed  in  the  introduction,  the  stan- 
dard perturbation  bound  (1.3)  is  proportional  to  the  reciprocal  of  gap{\i)  =  min  |\,-\j|. 

7*1 

Thus,  if  the  absolute  distance  from  \,  to  its  nearest  neighbor  is  small,  we  expect  the 
corresponding  eigenvector  to  be  sensitive  to  perturbations.  Our  first  result  will  be  an  analo- 
gous theorem  which  replaces  the  gap  with  the  relative  gap 

|X,->^*| 
relgap(\,)  =  min  .,^     ,  (7.1) 

;*'       |X,X;i 

which  may  be  large  even  when  the  usual  gap  is  small. 

Even  more  may  be  shown.  Proposition  6  will  show  that  the  eigenvectors  are  scaled 
analogously  to  the  matrix  entries:  if  x,  is  the  eigenvector  for  X,,  and  \j  differs  from  k,  by  a 
large  factor,  the  y-th  component  of  x,  will  be  small.  Theorem  7  will  show  that  small  relative 
perturbations  in  the  matrix  only  cause  small  perturbations  in  the  eigenvector  entries  relative 
to  their  upper  bounds  of  Proposition  6;  thus  some  tiny  eigenvector  components  may  be  deter- 
mined to  high  relative  accuracy  as  well.  Finally,  we  discuss  eigenvector  bounds  for  definite 
pencils  and  singular  vector  bounds  for  bidiagonal  matrices  (partially  settling  a  conjecture 
from  [9]). 

Perturbation  theory  and  algorithms  for  conventionally  diagonally  dominant  nonsym- 
metric  matrices  were  developed  in  [1],  under  the  assumption  that  the  gap  berween  eigen- 
values greatly  exceeded  the  norm  7  of  the  offdiagonal  part.  Thus  these  results  apply  to  gen- 
eral nonsymmetric  matrices,  but  require  much  stronger  diagonal  dominance  assumptions  than 
we  do  and  are  weaker  than  our  results. 
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Theorem  6:  Let  H  =  ^AL  be  a  y-s.d.d.  symmetric  matrix  with  respect  to  the  2-norm.  Let  E  be 
symmetric  and  have  2-norm  one,  and  define  //(4)  =  A(A  +^£)A.  Let  X.,(C)  be  the  i-th  eigenvalue 
of  H  {Q,  and  assume  X,(0)  is  simple  so  that  the  corresponding  unit  eigenvector  x^O  is  well 
defined  for  sufficiently  small  11,.  Then  for  asymptotically  small  £ 


kCO  -  ^,(0)11  ^  - — ^"    V^    ,.  -  +  oiC) 

(1  -  -y)  relgap{\i) 


Proof:  From  [19]  we  have 


Let  yj.  =  Axj..  Then 


x^  \E  \x 

x,a)  =  x,(0)  +  £2  77 TT^*  +  ^(^') 


The  pair  (Xj-.y^)  is  an  eigenpair  of  the  pencil  A  —  \A    -.  Thus  y^Ayj.  =  Xj.y^A    -y^.    From 
Lemma  3  we  have 

(1  -  7)11^*11'  ^    \yUy,\  =    |X,H|A-'y,ll=  =    lX,|  S  (1  +  7)lklP    • 
Thus 

(-^)"^^    ||y,||^(-^^)"^  (7.2) 

1  +  -y  1  ~  7 

If  we  let  Z).  =  yi-l  Wy^.  \\  then 

k*i        (X,  —   Xjt)/  |X,Xt| 
where  (1  +  7)"'  s    |^,^|  s  (1  -  7)"'.  If  we  take  norms  then 


x,(0  =  x,(0)  +  42  ^,^7^ ;\,,;   ,    ,„,;    +  0{C-)  (7.3) 


k(0  -  x,(0)||  ^ ^"     ^^'^     .^  .    +-  O(C^) 

(1  -  y)  relgap{\,) 


as  desired.  D 


Corollary  3:  Let  H{1,),   X,(0.   and  x,{t,)  be  as  in  Theorem  6.  Assume  further  that  H{1,)  is  y- 
s.d.d.  for  all  0:s;<^,  that  l-7-3nC>0,  and 

relgap{\,)  >  f-    . 

l-7-3nC 

Then 


lk(O--,(0)||  ^  ^^^^^=^^^^ ^_,^,     -     . 

2-(l-7-3nO-(rWgflp(X,)  -    ^'-~    ""'i  ) 

l-7-3n{ 

Proof:  The  idea  is  that  if  ^  is  small  enough,  the  Xt(0  can  only  change  by  a  small  relative 
amount,  so  the  relative  gap  can  only  change  by  a  small  absolute  amount.  From  Proposition  4, 
we  can  bound  the  perturbed  relative  gap  from  below  as  follows: 


-....1--^ 

relgap{\,{Q)  =   min   '    ')''' — ^  '''^''/,.\    a  min- 


|x,(5)-x,(o|  |x,(0)-x,(0)|- Y^(ix,(0)K  l>^*(0)l) 

'-  !>     min-"         ■  -  ,■--.■.■  i-       i 


|x,(Ox,(OI"'       -  |x,(o)x,(0)|(i-^^) 

1-7 


■1 


17 


-  (i-^)Ti»K-'«.p(^.(0),x.(0))  -  TJ7    |x,(0)x.(0)|"'    ' 

where  re/gap (X,(0),Xt(0))  =    |X,(0)- X,(0)  |- |X,(0)Xt(0)  |  ""^ 

We  consider  two  cases,  ^e/gflp(X,(0),X^(0))s2-"^  and  re/gap(X,(0),Xt(0))^2-"-.  The 
first  case  corresponds  to  X,(0)  and  Xt:(0)  differing  by  at  least  a  factor  of  2,  whence 

|X,(0)|  +   1x^(0)1 


mm 


**.      |X,(0)Xt(0) 


11/2 


<  3-relgapi\,iO),\,iO)) 


and 


r«/gap(X,(0,X.(0)s  (l-^)Te/gap(X,(0),X^(0))-(l-YI^)    • 

The  second  case  corresponds  to  X,(0)  and  Xt(0)  differing  by  at  most  a  factor  of  2,  whence 

|X,(0)|  +    |Xt(0)| 


min- 


and 


*-.   ix,(o)Xi(o)r 


__iLL 


1/2 


3.2-1/2 


re/gap  (X,(0,Xt(0)  ^  {l- j^)- {re  I  gap  i\,iO),\^iO))  - 


3-2" 


^ 


1-7 


(7.4) 


Altogether,  we  have 


relgapiKU))  ^  (1- ^)-(l- ff^)(r./gap(X,(0))  -    '^J  J^/^'/l 


_    (3-2-'^-nC)/(l-7) 

7) 


Now  integrate  the  bound  of  Theorem  6  from  ^  =  0  to  4=  C  to  get  the  desired  result.  □ 

The  next  theorem  shows  that  the  components  of  each  eigenvector  are  scaled  analogously 
to  the  way  the  matrix  is  scaled: 

Proposition  6:  Let  H=AAA  be  a  y-s.d.d.  symmetric  matrix  with  respect  to  the  2-norm.  Let 
H=X\X^  be  its  eigenvector  decomposition,  where  X=[x\,  .  .  .  ,x„]  is  the  matrix  whose 
columns  are  orthonormal  eigenvectors  and  A  =  diag(Xi,  .  .  .  ,X,).  Let  x,{j)  be  the  j-th  com- 
ponent of  X,.  Then 


We  also  have 


|j^,0')l  ^  ^.0) 


<iij)  I 


1±X 


U-^J 


3/2 


■min( 


1/2 


X, 


1/2 


) 


1^7 


3/2 


ll-^J 


"*ii        '-•11 


Proof:    Let   y,  =  Ax,    for    all    /.    First   we    consider    the    case    A„sA,^.    From    (7.2)    we    have 
||y,  II  s  ( |X,  1/(1—7))'*.    Thus,  applying  Proposition  2  as  well. 


kO-)l  =  A,^'|y,0-)l  ^  A-'(lx,|/(i-7))"- 

as  desired.  We  may  also  write 

-1 


1-^7 


X;  I-7J 


1/2 


kO-)l  =  ^Jj'\y:iJ)\  ^  A-U|x,|/(i-7))"'  :£  -^- 

to  get  the  other  inequality. 


l±x 


\  1/2 


U-^J 
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Now  consider  the  case  A„S;A^j.  We  will  take  the  y-th  components  of  both  sides  of  the 
equality  Ay,  =  X,A~"y,,  and  bound  them  as  follows.  The  left  hand  side  component  is 
bounded  above  in  absolute  value  by 

(l  +  7)lhll^  (l  +  7)(l-7)-"'l^,l"'    ■ 
The  right  hand  side  component  is  bounded  below  in  absolute  value  by 

\\,\^Jr\yi{j)\  ^  iX-y)\^J^jV\y.iJ)\  ■ 

Thus 


k,0-)l  =  Aj-'ly-a) 


,_(ii2ll)^ 

''    (l-7)'''l^,l 


1/2 


3/2 

1/2 


as  desired.  The  bound  in  terms  of  A;j/A„  is  obtained  similarly.  □ 

Thus,  if  a  matrix  is  strongly  scaled  (the  Ajj  vary  greatly  in  magnitude),  the  eigenvectors 
will  be  strongly  scaled,  and  small  relative  perturbations  in  the  matrix  entries  will  not  be  able 
to  change  the  smaller  eigenvector  components  much.  In  the  next  theorem,  we  prove  some- 
thing even  stronger:  the  perturbations  in  the  eigenvector  components  x,(J)  will  be  small  com- 
pared to  the  upper  bounds  x,(j): 

Theorem  7:  Let  H(Q  be  as  in  Theorem  6.  Let  x,{Q{j)  denote  the  j-th  component  of  the  i-th  unit 
eigenvector  of  H{t,).  Let  x,(J)  be  the  upper  bound  for  the  j-th  component  of  the  unperturbed 
unit  eigenvector  x,(0)(J).  Then 

WOO-)  -  ^,(0)0') I  ^  c--^ — r~"TT^4rv~FT^'^''^')  ^  ^^^'^ 

il-y)-mm{relgap{\,)  ,  2    '  •^) 

fX,  is  the  same  as  X.,(0)j.  Note  that  relgap{\,)  exceeds  2~^'^  only  when  X,  differs  from  its  nearest 
neighbor  by  a  factor  greater  than  2  or  less  than  112. 

Proof:  We  start  from  (7.3)  in  the  proof  of  Theorem  6;  it  implies 
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as  desired.  □ 


A  version  of  Theorem  7  for  nonasymptotically  small  C,  can  be  proven  in  the  same  way 
as  Corollary  3  was  derived  from  Theorem  6. 

If  we  have  a  cluster  of  eigenvalues  which  are  relatively  well  separated  from  the  others, 
similar  analyses  to  those  above  show  that  the  invariant  subspace  they  span  is  insensitive  to 
perturbations,  even  if  the  individual  eigenvectors  are  sensitive. 
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Now  consider  s.d.d.  definite  pencils.  From  Proposition  5  of  the  last  section,  we  know 
we  can  often  reduce  such  pencils  to  standard  form  without  sacrificing  diagonal  dominance: 
Given   H-\M,   M    positive    definite    with    lower    triangular    Cholesky    factor   L,    the    matrix 


y  =  L-' 


HL  '"^  has  the  same  spectrum  as  //  -  \M.  Also,  if  y  is  an  eigenvector  of  y,  x  =L    'y  is 


an  eigenvector  of  H  -\M\  thus  Theorems  6  and  7  can  be  used  to  derive  perturbation  bounds 
on  X,  although  we  will  not  do  so  here. 

Finally,  we  consider  perturbation  theory  for  the  singular  vectors  of  bidiagonal  matrices. 
Let 


B  = 


a-i    b- 


(7.5) 


be  bidiagonal,  where  as  in  Theorem  1  we  may  assume  without  loss  of  generality  that  all  a, 
and  b,  are  positive.  Recall  that  the  left  singular  vectors  of  B  are  the  eigenvectors  of  BB^  and 
the  right  singular  vectors  of  B  are  the  eigenvectors  of  B^B.  Since 
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small  relative  perturbations  in  the  a,  and  b,  only  cause  small  relative  perturbations  in  the 
entries  of  BB'  and  B'B.  Therefore,  we  can  reduce  perturbation  theory  for  the  singular  vec- 
tors of  a  bidiagonal  matrix  to  perturbation  theory  for  the  tridiagonal  matrices  BB^  and  B^B. 

Proposition  7:  Let  B  be  as  in  (7.5).  Since  BB  and  B^B  are  positive  definite  tridiagonal,  and 
hence  s.d.d.,  small  relative  changes  in  the  entries  of  B  cause  perturbations  in  the  singular  vec- 
tors as  described  by  Theorems  6  and  7.  More  specifically,  let  Di  =  diag((SS' )„)  and 
Dr  =  diag((S^S)„).  Then 
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Both  yi  and  y^^  are  bounded  by  2.    If  aj>3^'^b,,  then  yi  <  1,  and  if  aj>3^''^bj-i,  then  -vy;  <  1. 
Proof:  A  simple  computation  shows  that  the  diagonal  of  Dl^''^BB^Dl^''  consists  of  ones  and 
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b„-^-{al-i+bl-i)~'^'-  for;  =  n-l.    Thus  yi  bounds  the  1-norm  of  Di^'^BB'''Di^''^  -  I.  A 
similar  computation  applies  to  fl   B.  n 

Thus,  the  sensitivity  of  the  singular  vectors  to  relative  perturbations  in  the  entries  of  B 
is  governed  by  the  relative  gap  between  singular  values,  as  conjectured  in  [9].  Actually, 
more  was  conjectured:  it  does  not  appear  that  the  measure  of  diagonal  dominance  y  of  BB^ 
or  B^B  affects  the  singular  vector  sensitivity.  A  proof  of  this  stronger  conjecture  will  appear 
in  [6].  More  precisely,  a  version  of  Theorem  6  is  proved  in  [6]  without  the  I-7  factor  in  the 
denominator,  and  extended  to  nonasymptotically  small  perturbations  as  in  Corollary  3. 
Whether  Theorem  7  can  be  generalized  without  a  1/(1-7)  dependence  is  an  open  question. 


8.  On  Condition  Numbers  and  the  Distance  to  the  Nearest  Ill-Posed  Problem 

In  [7]  it  was  observed  that  a  common  feature  of  many  numerical  analysis  problems  is 
that  their  condition  numbers  approximate  or  at  least  bound  the  reciprocal  of  the  distance  to 
the  nearest  ill-posed  problem,  i.e.  problem  whose  condition  number  is  infinite.  The  classical 
example  of  this  is  matrix  inversion,  where  the  condition  number  of  a  matrix  T  is 
k(T)  =  \\T\\-  Ijr"'  ||.  By  scaling  T  we  may  assume  without  loss  of  generality  that  ||r|l=  1,  so 
that  k(T)  =  ||r"'  II .  But  the  distance  (in  the  ||- 1|  norm)  from  T  to  the  nearest  singular  matrix 
(nearest  matrix  whose  condition  number  is  infinite)  is  ||r~'  ||  =  1/k(T).  Thus,  the  condition 
number  is  exactly  the  reciprocal  of  the  distance  to  the  nearest  ill-posed  problem.  This 
phenomenon  recurs  throughout  numerical  analysis,  although  we  usually  get  a  somewhat 
weaker  relationship,  such  as  one  sided  bounds  between  the  condition  number  and  reciprocal 
distance. 

Here  we  investigate  this  phenomenon  in  the  case  of  finding  the  eigenvectors  of  a  sym- 
metric matrix.  From  (1.3),  we  see  that  l/gap(X.,)  is  a  condition  number  for  the  j-th  eigenvec- 
tor of  a  general  symmetric  matrix.  This  condition  number  is  infinite  precisely  when 
gap (\i)  =  0,  i.e.  \,  is  a  multiple  eigenvalue.  It  is  reasonable  to  call  such  an  eigenproblem  ill- 
posed  because  the  eigenvector  is  no  longer  uniquely  determined:  any  vector  in  an  at  least 
two-dimensional  invariant  subspace  will  do.  It  is  elementary  to  show  that  the  reciprocal  of 
this  condition  number  gives  exactly  the  distance  to  the  nearest  ill-posed  problem: 

Proposition      8:      Let     H     be     a     symmetric     matrix      with      simple      eigenvalue      X,;      thus 
gapCK,)  =   min  |X.,— Xj.|  >  0.     Then    the    smallest    ||5//||    such    that   the    eigenvalue   of  H  ^hH 

"corresponding  to"  X,  is  multiple  is 


min  WhHW  = 


By  "the  eigenvalue  corresponding  to  X,  is  multiple"  we  mean  that  if  the  continuous  function  X,(^) 
is  an  eigenvalue  of  H  +  i?>H  for  all  0£^<1,  with  X,(0)  =  X„  then  X,(^)  is  simple  for  0<4<1 
and  X,(l)  is  multiple. 

Proof:  Suppose  |Xj-X,|  =  gap(\,)  and  that  x,  and  xj  are  corresponding  unit  eigenvectors.  Let 
6W  =  .5-{\j-k,)-{x,xl  -  XjxJ)  to  show  rain  ||S//||  ^  gap{\,)l2.  To  get  the  other  inequality 
apply  (1.1)  to  see  that  any  smaller  ||5//||  could  not  move  either  X,  or  X^  more  than  half  the 
distance  towards  one  another.  D 
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It  turns  out  that  a  similar  relationship  holds  for  7-s.d.d.  symmetric  matrices,  provided 
we  measure  distances  in  the  scaled  way  used  so  far  in  this  paper,  and  that  we  use 
llrelgap{\,)  as  a  condition  number.  This  is  interesting  because  it  extends  work  in  [7]  to  a 
case  where  the  distance  metric  is  quite  skewed  from  the  usual  norm,  and  shows  that 
\lrelgap{\i)  is  the  most  natural  condition  number  for  this  problem,  because  it  shares  the 
same  geometric  properties  as  other  condition  numbers. 

Proposition  9:  Let  H=^AA  be  an  n  by  n  y-s.d.d.  symmetric  matrix  with  simple  eigenvalue  X.,; 
thus    relgapiX,)  =  min  iX,- X.^  |  •  |X,Xt  I""*  >  0-     Assume  further    that   rc/gap  (X,)s2~"^    (this 

means  that  X,  and  its  nearest  neighbor  differ  by  a  factor  between  .5  and  2).  Then  the  smallest 
||5A  II  such  that  the  eigenvalue  o/A(A  +5A)A  "corresponding  to"  X,  is  multiple  satisfies 


2'^^-3-n      ,  3  ^          2'^--3-n      ,  3-^,      2'^^-12-n(l- 7)  ^w2 
relgapiX,)  ^  relgapjk,) relgapjk,) 

2"--6-n 


^  min     8A 


relgap(\,) 


2  r./gflp(X,)-2>'=-n-^^^^ 


For  relgap{\,)«l,  the  lower  bound  on  min  ||8A  jj  equals 

re/gap(X,)(l--y)/(2"^-3-n)  +  0{{relgap  (X,))^).  In  other  words,  both  the  upper  and  lower 
bounds  are  Q{relgap(X,)). 

Proof:  By  scaling  H  we  may  assume  without  loss  of  generality  that  A„  =  1.  Let  X^  satisfy 
relgap{X,)  =    |X,- X^  j ■  |X,X^  | "' ,  and  let  x,  and  x^  be  corresponding  unit  eigenvectors. 

First  we  prove  the  upper  bound  on  min||6A||.  From  Proposition  6  we  have  that 
Kx.xf),,!  s  (l  +  7)^(l-7)"'AtA,.  Thus  ||A-'x,xrA-'||  s  n(l  +  7)'(l--Y)"'-  Let 

5A   =   {Xj-X,)x,xJ.  Clearly  A(A  4-5A)A  has  a  multiple  eigenvalue  at  X^  as  desired.  Also, 

|X,-X,|   =   relgapiX,)-\X,Xj\"-  ^  2"'relgapiX,)-\X.\  s  2"'-il+y)-relgapiX,)    , 

which  when  combined  with  the  bound  on  ||A~'x,xfA'''  ||,  yields  the  desired  upper  bound. 

Now  we  consider  the  lower  bound.  Abbreviate  ||6A  ||  by  ii.  Note  that  if  H  is  7-s.d.d., 
then  H^A5AA  is  (-/  +  2Ti)(l-Ti)"'-s.d.d.,  if  (-/  +  2-n)(l-ii)~'  <  1.  From  (7.4)  in  Corollary 
3,  we  see  that  if  (7  +  2ti)(1  — •n)~'<l,  then  the  perturbed  relative  gap  will  be  at  least 

relgapiX.)  —    ' . 

1  -   (7  +  2ti)(1-^)-' 

In  order  for  the  perturbed  relative  gap  to  be  zero,  this  lower  bound  will  have  to  be  nonposi- 
tive,  and  so 

releap  (X.)  , 

^-    ..1-        (^  -  (7  +  2Ti)(l--n)-') 

3-2      -n 

Solving  for  the  smallest  t\  satisfying  this  inequality  yields  the  desired  lower  bound.  When 
relgap(Xj)«l  so  that  j)  is  small  enough  that  (7  +  27i)(1-t))~'~7,  we  get 
Ti  =  re/gap(X,)(l-7)/(3-2''--n)  as  desired.  □ 

The  same  results  could  have  been  obtained  using  the  general  machinery  of  differential 
inequalities  in  [7],  but  these  proofs  are  more  straightforward. 


9.  Algorithms  for  the  Bidiagonal  Singular  Value  Decomposition 

In  this  section  we  discuss  algorithms  capable  of  attaining  the  high  relative  accuracy 
inherent  in  the  data  as  described  in  Theorem  1.  Most  of  this  work  has  appeared  elsewhere, 
but  since  we  will  need  the  results  in  the  next  section,  we  outline  them  here. 

The  three  classes  of  algorithms  we  will  discuss  are  QR,  bisection,  and  divide  and  con- 
quer. The  standard  QR  iteration  [12]  as  implemented  in  LINPACK  [3]  does  not  compute  all 
singular  values  to  high  relative  precision.  It  may  be  modified,  however,  to  achieve  this  as 
described  in  [9].  Briefly,  the  idea  is  to  use  a  zero  shift  in  a  QR  sweep  when  a  tiny  singular 
value  is  present.  It  turns  out  this  zero-shift  QR  can  be  implemented  in  a  forward  stable  way 
that  only  introduces  small  relative  errors  in  each  entry  of  the  bidiagonal  matrix.  Corollary  1 
of  Theorem  1  then  guarantees  the  singular  values  are  not  changed  significantly.  The  standard 
convergence  criterion  must  also  be  changed  to  guarantee  high  relative  accuracy;  see  [9]  for 
details.  The  resulting  algorithm  is  not  only  more  accurate  than  the  standard  implementation 
but  faster  on  the  average;  this  is  because  the  zero-shift  QR  sweep  contains  significantly  fewer 
floating  point  operations  than  shifted  QR. 

It  was  conjectured  in  [9]  that  this  modified  QR  algorithm  computes  the  singular  vectors 
as  accurately  as  the  "relative  gap"  error  bounds  of  section  7  permit;  this  conjecture  is  sup- 
ported by  Proposition  7  and  will  be  proven  completely  in  [6]  (see  section  7  for  discussion). 

Bisection  is  another  method  that  guarantees  high  relative  accuracy.  An  error  analysis  of 
the  Sturm  sequence  recurrence  for  counting  the  number  of  singular  values  of  a  bidiagonal 
matrix  in  an  interval  [14]  shows  that  it  computes  the  exact  number  of  singular  values  for  a 
matrix  differing  from  the  original  one  only  by  small  relative  perturbations  in  each  entry; 
Corollary  1  of  Theorem  1  then  guarantees  high  relative  accuracy  again. 

Divide  and  conquer  [13]  has  not  yet  been  shown  to  achieve  high  relative  accuracy,  at 
least  without  resorting  to  extended  precision  arithmetic  in  the  inner  loop.  Achieving  this  accu- 
racy is  a  current  area  of  research. 

10.  Algorithms  for  the  Symmetric  Tridiagonai  Eigenproblem 

In  this  section  we  present  algorithms  for  computing  eigenvalues  of  -y-s.d.d.  symmetric 
tridiagonai  matrices  to  high  relative  accuracy.  We  note  that  reducing  a  dense  -y-s.d.d.  sym- 
metric matrix  to  tridiagonai  form  will  not  generally  preserve  diagonal  dominance  or  the  accu- 
racy of  the  eigenvalues;  thus  the  algorithms  in  this  section  are  suitable  only  when  the  original 
matrix  is  tridiagonai.  If  the  original  matrix  is  dense,  the  algorithm  in  the  next  section  should 
be  used. 

Briefly,  bisection  can  always  be  used  to  find  the  eigenvalues  accurately.  If  the  matrix  is 
positive  definite  as  well,  Cholesky  followed  by  the  algorithm  of  the  last  section  applied  to  the 
bidiagonal  Cholesky  factor  can  be  used.  QR  does  not  seem  to  work  in  general,  but  may  if  the 
matrix  is  strongly  diagonally  dominant  and  monotonically  graded.  It  is  still  an  open  question 
whether  divide  and  conquer  techniques  [5,  11]  can  achieve  high  relative  accuracy. 

As  with  the  bidiagonal  singular  value  problem,  the  standard  implementation  of  QR  for 
tridiagonai  matrices  does  not  guarantee  high  relative  accuracy  in  the  computed  eigenvalues 
for  symmetric  7-s.d.d.  matrices,  even  if  the  change  in  convergence  criterion  suggested  in  sec- 
tion 6  is  adopted.  Even  if  a  zero-shifted  QR  algorithm  like  the  one  used  for  the  bidiagonal 
singular  value  decomposition  is  used,  relative  accuracy  is  lost  (in  numerical  experiments  on  a 
strongly  s.d.d.  positive  definite  matrix,  negative  eigenvalues  were  computed). 

However,  recent  work  by  Le  and  Parlett  [16]  gives  some  hope  that  zero-shifted  tridiago- 
nai QR  may  sometimes  be  used  in  the  same  way  as  the  bidiagonal  zero-shifted  QR  to  com- 
pute all  eigenvalues  to  high  relative  accuracy.  Their  work  shows  that  the  inner  loop  of  the 
standard  QR  iteration  may  be  modified  to  provide  componenrwise  relative  stability  in  the 
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following  sense:  in  floating  point  this  modified  zero-shifted  QR  is  equivalent  to  making  small 
relative  perturbations  in  each  entry  of  the  tridiagonal  matrix,  performing  QR  exactly  on  this 
perturbed  matrix,  and  again  making  small  relative  perturbations  in  each  entry  of  the  resulting 
matrix.  This  is  a  much  stronger  kind  of  stability  than  the  usual  kind  described  in  paragraph 
(1.2)  of  the  introduction.  If  the  original  and  final  tridiagonals  are  also  -y-s.d.d..  Proposition  4 
would  imply  that  their  eigenvalues  agree  to  high  relative  accuracy.  Thus,  QR,  combined  with 
the  stopping  criterion  (6.4),  could  be  used  to  compute  all  eigenvalues  to  high  relative  accu- 
racy, but  only  if  all  the  matrices  produced  in  the  course  of  the  iteration  were  7-s.d.d.. 
Unfortunately,  numerical  experiments  show  this  is  not  the  case  in  general,  and  may  only  be 
true  if  the  original  matrix  is  very  strongly  diagonally  dominant  and  monotonically  scaled 
(Hi,^H,-i  ,-1).  Thus,  we  do  not  expect  to  be  able  to  generally  use  QR  based  algorithms  for 
the  -y-s.d.d.  tridiagonal  eigenproblem. 

If  we  limit  ourselves  to  positive  definite  matrices  T,  the  following  QR  based  approach 
will  work.  The  following  algorithm  originally  appeared  in  [10].  Recall  that  a  tridiagonal 
matrix  T  is  positive  definite  if  and  only  if  it  has  a  positive  diagonal  and  is  -y-s.d.d.  for  some 
7<1. 

Algorithm  2:  Computing  the  eigenvalues  of  a  positive  definite  tridiagonal  matrix  T: 

1)  Compute  the  Cholesky  factorization  LL^=T  of  T. 

2)  Find  the  singular  values  of  the  bidiagonal  matrix  L  using  the  bidiagonal  QR  algorithm 
of  section  9. 

3)  Square  the  singular  values  of  L  to  get  the  eigenvalues  of  T. 

To  show  that  this  method  is  viable,  one  needs  to  show  that  scaled  diagonal  dominance  is 
sufficient  to  guarantee  that  the  squares  of  the  exact  singular  values  of  L  are  all  relatively 
close  to  the  eigenvalues  of  A;  the  algorithms  of  section  9  then  guarantee  that  we  can  compute 
the  singular  values  of  L  to  high  relative  accuracy. 

To  this  end  we  present  a  backward  error  analysis  of  the  Cholesky  decomposition  of  a 
positive  definite  symmetric  tridiagonal  matrix  A.  Our  goal  is  to  show  that  the  computed 
Cholesky  factor  L  is  a  small  componentwise  relative  perturbation  of  the  exact  Cholesky  factor 
of  a  small  componentwise  relative  perturbation  of  A.  We  assume  the  usual  model  of  floating 
point  arithmetic // (a  op  b)  =  {a  op  Z>)-(l-t-e),  where  |e  |s€  and  op  €  {+  ,-  ,^  j),  and  that 
the  floating  point  square  root  function  j^rr  satisfies  j-^rr(a)  =  (l-i-«)a''^  where  |c  |se. 

Proposition  10:  Let  A  be  an  n  by  n  positive  definite  symmetric  tridiagonal  matrix  with  diagonal 
entries  ax,  .  .  .  ,a„  and  offdiagonal  entries  b^,  .  ■  .  ,b„-i.  Let  L  be  the  computed  Cholesky  fac- 
tor from  the  following  algorithm: 

h\   =  sqrtiax) 
for  1  =  1  to  n  —  \ 

h-\.i-\  =  sqrt{a,.x-il-\.i) 
endfor 

Then  barring  over  I  underflow  and  attempts  to  take  square  roots  of  negative  arguments.  L  is  the 
exact  Cholesky  factor  of  A=  A  ~h  A  =  LL^ ,  where   |8i4,^|  ^  ^(€)|Ay|,  and 

g(6)  =  3e-3€-^e^  ^  (4e  +  6€--^ 46^-^6^)-  ''^^^\    =  7e-^0(e-)  (10. 1) 

Proof:  We  construct  A=A~hA  as  follows.  Subscripted  es  denote  independent  quantities 
bounded  in  norm  bv  e. 
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/„   =  flisqrtia,))  =   (l  +  €„)ai'^'   =   ((1  + €„)'«  i)"'  =  ^ ,  "^ 
/,,_!   =  /;(Z7,_i/;,_,.,_i)   =   (l+e,!)^,-!/'.-!..-!  =  ^<-l/',-l,,-i 
/,,  =  fl{sqrtia,-ll,-i)) 

=    (l+€,2)-((l+e.3)(fl.-(l+e.4)'r.,-l))'" 

=    ((l+e,2)-(l  +  €,3)fl,    -    (l+C,2)-(l+e,3)(l+e,4)/7,,_,))"' 

-  ((l  +  g,i)^,  -  (l  +  ^,:)'r.,-i)'" 
where     |^,i  |  £  3e  +  3e-  +  e3  =  3e+ O(e-)     3^^     j^,.  |  ^  4€  +  6e- +  Ac^  +  e^  =  4e+ 0  (c").      By 
assumption  (l+g,i)a,<(l  +  ^,2)'r.i-i  so 

^i2'r,,-i  1  ^  1^,2 1-7- '^1  -  gii^' 

where 
Thus 

=  ((i+^,4)a,-'r,,^i)'" 

where 

|^,4|  s  3e^3e-  +  e2^(4e  +  6e=  +  4€^  +  e^)-^^^-^  =  7e  +  0(e-) 

as  desired.  □ 

Theorem  8:  Let  A  be  an  n  by  n  positive  definite  symmetric  tridiagonal  matrix,  L  its  computed 
Cholesky  factor  as  in  Proposition  10,  and  g  (e)  as  in  (10.1).  Let  ■/<!  be  the  scaled  diagonal 
dominance  of  A.  Let  <J\^  ■  ■  ■  ^cr„  be  the  singular  values  of  L  and  XjS  •  ■  •  ^X„  be  the  eigen- 
values of  A.  Then 

For  example,  when  e<.001,  the  upper  and  lower  bounds  are  bounded  by  1±[ — '■ ]. 

1-7 

Proof:  Combine  Corollary  1  of  Theorem  1,  Theorem  2  and  Proposition  10.  n 

It  is  easy  to  find  T  for  which  Algorithm  2  computes  all  the  eigenvalues  accurately,  but 
where  the  standard  QR  iteration  [17]  loses  all  accuracy  on  the  smallest  eigenvalues. 

Since  the  bidiagonal  SVD  algorithm  of  section  9  can  compute  the  singular  vectors  as 
accurately  as  the  "relative  gap"  error  bound  permits  (see  the  discussion  of  section  9),  Algo- 
rithm 2  will  compute  the  eigenvectors  of  T  as  accurately  as  Theorem  6  permits. 

Bisection  is  a  viable  algorithm  for  all  s.d.d.  tridiagonal  matrices.  A  similar  error 
analysis  to  the  one  for  bidiagonal  Sturm  sequences  shows  that  Sturm  sequences  can  find  accu- 
rate eigenvalues  of  7-57  where  57  causes  only  small  relative  perturbations  in  each  entry  of 
7  [14].  No  pivoting  is  required,  so  Sturm  sequence  evaluation  can  be  done  in  linear  time  with 
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no  storage  beyond  that  needed  for  T.  Together  with  Theorems  2  or  4,  this  implies  that  the 
computed  eigenvalues  all  have  high  relative  accuracy  if  the  matrix  is  symmetric  tridiagonal 
7-s.d.d. 

As  in  the  bidiagonal  case,  the  ability  of  divide  and  conquer  algorithms  [11]  to  achieve 
high  relative  accuracy  is  an  open  problem. 

11.  Algorithms  for  the  Dense  Symmetric  Eigenproblem 

First  we  present  a  new  algorithm  (or  rather  a  new  analysis  of  an  old  algorithm)  for 
finding  accurate  eigenvalues  of  (possibly  dense)  symmetric  s.d.d.  matrices.  The  algorithm  is 
based  on  bisection,  but  rather  than  computing  the  LDL^  factorization  of  H  -xl  and  using  the 
number  of  negative  D„  to  compute  the  number  of  eigenvalues  of  H  less  than  x,  we  compute 
the  LDL'^  factorization  oi  A  -xA"-,  where  /f  =  AAA. 

Second,  we  show  that  a  suitable  variation  of  inverse  iteration  can  compute  the  eigenvec- 
tors of  a  symmetric  s.d.d.  matrix  to  the  limiting  accuracy  of  the  "relative  gap"  error  bounds 
in  Theorems  6  and  7. 

Algorithm  3:  Stably  computing  the  inertia  of  a  shifted  y-scaled  diagonally  dominant  sym- 
metric matrix  H  —  xI: 

(0)  We  assume  as  before  that  A„=±l.  We  consider  only  a:>0;  for  x<0  consider 
-H  -xl. 

(1)  Permute  the  rows  and  columns  of  A  -xA~"  and  partition  it  as 

A  11  —  xA  f  *  A  12 

A  21  A22~^^2 

so  that  if  a  —xd~'  is  a  diagonal  entry  of  A  n  — xAf ",  then  either  a  =  —  loTa  =  l  and 
xd--^2. 

(2)  Compute  X  =  A2:-xAr-  — A2i(Aii— xAf-)~'Ai2. 

(3)  Compute  inertia(X)=  (n  ,  z  ,  p)  using  a  stable  pivoting  scheme  such  as  in  [4].  Here 
n  is  the  number  of  negative  eigenvalues,  z  the  number  of  zero  eigenvalues,  and  p 
the  number  of  positive  eigenvalues  of  X. 

(4)  The  inertia  oi  H  —xl  is  (n-i-dim(A  n)  ,  r  ,  p). 

Theorem  9:  Let  //  =  AAA  be  a  y -scaled  diagonally  dominant  symmetric  matrix  and  x>0  a  real 
scalar.  Algorithm  3  computes  the  exact  inertia  of  a  matrix  H  ^hH  —  xl,  where  5ff  =  A5AA, 
||5A  ||  =  0(e),  e  being  the  machine  precision.  Thus,  Algorithm  3  can  be  used  in  a  bisection  algo- 
rithm to  find  all  the  eigenvalues  of  H  to  high  relative  accuracy. 

Proof:  The  partitioning  guarantees  that  the  diagonal  entries  of  A^^-xA'-  are  all  less  than  or 
equal  to  -1.  Therefore,  all  the  eigenvalues  of  An— xAf"  are  less  than  or  equal  to  -I  +  7. 
Since  X  is  defined  so  that 


Aii-xA,  - 


An 
A^-.  — xA- 


Aii-xAf-    0 
0  X 


I    (A,,-xAr  =  )-'A,2 
0  / 


A2,(A„-xAr^)-' 
the  inertia  of  A  —  xA  ~"  is  equal  to 

inertia(A -xA~-)  =  inertia(X)  +  inertia(A  n -xAf -)  =  inertia(X)  +  (dim(A  ii),0,0) 

by  Sylvester's  Theorem.  The  algorithm  in  [4]  will  compute  the  exact  inertia  of  X  -r  SX,  where 

||5X||=0(£)||X||.  Thus  if  we  show  that   ||X||   is  of  order  I-7  <    IIA2:  ||  ^   1-r 7,  we  will  be 
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done. 


To    this    end    we    note    that    by    construction 

||Au||=||A2i  11^7-  Thus 


as  desired,  n 


In 


the 


case 


of 


l  +  -y  +  2  + 


pencils,        there 


xA- 


1-7 


IS 


=  2,    (Tm,„(/iii-J:Ai -)s:l--Y,     and 


1-7 


an        analogous        algorithm. 


If 


H-\M  =  AfjAH^H~^^M^M^M  is  a  7-s.d.d.  definite  pencil,  we  compute  the  LDL^  decompo- 
sition of  A/^-xA^'A^A^/A.^/A^'  in  order  to  count  the  number  of  eigenvalues  less  than  x. 
Henceforth  we  will  assume  without  loss  of  generality  that  H-\M  =  W-X.AAA  with 
|/f„|  =  A„  =   1. 


Algorithm   4:  Stably  computing  the  inertia  of  a  shifted  -y-scaled  diagonally  dominant  definite 
pencil  H  —  xM: 

(0)  As     stated     above,     we     assume     without    loss     of     generality     that    A/=AAA     with 
\fin  I  =  '^M  -  1-  We  also  consider  only  x  >0;  for  x  <0  consider  - H  -xM . 

(1)  Permute  the  rows  and  columns  of  H  -xAAA  and  partition  it  as 

//ii-xAf'AiiAr'    ffi2-xAr'Ai2A2~' 


//-.,-xA- 


"'A-,-,A-' 


//21  — XA2    A21A1 

so  that  ii  h—xd~-  is  a  diagonal  entry  of  Hn-xAf'A  nAf ',  then  either  h  =  -\oTh=l 
andx(i"^  ^  M-  =  2(l+7)/(l-7). 

(2)  Compute 

X  =  H22-xA2^A22^2^  -  (^:i-^A7'A2iAr')-(//ii-xAr'A„Ar')~>-(/f ,2-xAf'A,2A2'') 

(3)  Compute  inertia(X)  =  {n  ,  z  ,  p)  using  a  stable  pivoting  scheme  such  as  in  [4]. 

(4)  The  inertia  of  H  —xM  is  (n  +  dim(A  ,1)  ,  -  ,  p). 

Theorem  10:  Let  H  —  \M  =  AhAhAh~^^mAm^M  l'^  ^  'i-^-'^-^-  definite  pencil  and  x>0  a  real 
scalar.  Algorithm  4  computes  the  exact  inertia  of  H  +  hH  —  xM ,  where  8W  =  A^SAA^, 
||5A  II  =  0(e),  e  being  the  machine  precision.  Thus,  Algorithm  4  can  be  used  in  a  bisection 
algorithm  to  find  all  the  eigenvalues  of  H  —  \M  to  high  relative  accuracy. 

Proof:  Let  Y  =  7f  n  "^Af'A  nA  f' ,  and  define  A:  =  x"'A,/f  ,iA, -A  „,  so  that  Y  =  xAf'A'Af'. 
Now  if  \{K)  is  any  eigenvalue  of  K,  we  have 


HK)  S  X^„(x-'A,//„A,)  -  X„,„(A„)  S  ^L-Hl  +  7)  -   (1-7) 


klSL 


and   ||Ar    '||  s  2/(1-7).  Thus,  Y  is  also  nonsingular,  with  all  negative  eigenvalues.  Since  X 
and  Y  are  defined  so  that 


//ii-xAf'AiiAf'    //,2-xAr'A,;A2"'' 


H-'-L  -xAi  'A-.,  A 


I'^i 


//^i  — xA->    AtiA-i 


/  0 

(//2i-^A2"'A2iAr')y-'   / 


Y    0 
0   X 


I    y"'(//i2-xAr'Ai2A2"') 
0  / 


we  have 
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incnia-iH  -xM)  =  inertia(X)  +  inertia(r)  =  inertia(X)  -  (dim(A  ,i),0,0) 

by  Sylvester's  Theorem.  The  algorithm  in  [4]  will  compute  the  exact  inertia  of  X  -5X,  where 
||8X  II  =  0(e)||X  ||.  Thus  if  we  show  that  ||X  ||  is  of  order  l-7<  ||//22ll-l-7.wewillbe 
done.    To  this  end  we  write 

I|A:||  £    \\H22\\  -r    ||;tA2-'A::A2-'||  +    ||//2ii'"''f^i2  11  -    jjjrA.-'A:,  Af'^'/fn  || 
+    \\H2iY'^xAiUi2^z^\\  +    jjxA.-UoiAr'y-'jrAr'Ai.A."'!! 
=    11^22  II  +    |lxAf'A22A2"'ll  +    l|W2i^~'A,i(:-'A,//,2ll  -r    1 1 A2" 'A 2, AT " ' A ,/f  ,2  1 1 

+   ||H2iA,^-'Ai2A2"'|I  ^    ||xA2"'A2i/(:~'Ai:A2"''!i    ■ 

Using  the  facts  that  jJA  ,2  II  ^  7,  11^21  II  ^  7,  Ij-f^nll  ^  7,  l|//:i  II  ^  7,  I|a:~MI  ^  2/(1-7), 
||A,|H|A:-MI  ^  l,x-'||Ai||-  s  ^i    ',  andxIlA.  '11"  ^  i^,  we  get 

IIXll  s  1-7  -r  (l+7)f^  -r  7-(2/(l-7))^^''  +  7'(2/(l-7))  ^  7'(2/(l-7))  +  7-(2/(l- 7))tt 

<  14/(1-7)- 

as  desired.  □ 

Now  we  present  a  variation  on  inverse  iteration  which  can  compute  the  eigenvectors  of 
a  symmetric  s.d.d.  matrix  to  the  limiting  accuracy  of  the  "relative  gap"  error  bounds  of 
Theorems  6  and  7.  A  similar  algorithm  applies  to  pencils: 

Algorithm  5:  Inverse  iteration  for  computing  the  eigenvector  x  of  a  symmetric  s.d.d.  matrix 
H  =  AAA  corresponding  to  eigenvalue  z: 

(0)  We  assume  the  eigenvalue  z  has  been  computed  accurately  using  one  of  the  previous 

algorithms. 

(1)  Choose  a  unit  starting  vector  y^;  set  i  =  0. 

(2)  Compute  the  LDV  factorization  of  P^ [A  -zA~-)P,  where  P  is  the  same  permutation  as 
in  Algorithm  3. 

(3)  Repeat 

i  =  i^l 

Solve  (A —lA  "-)>',  =  y,-!  for  y,  using  the  LDL'' factorization  of  step  (2) 
r=l/||j,ll 
y,  =  r-y, 
until  (r  =   0(e)) 

(4)  x  =  A-iy, 

Theorem   11:    Suppose  Algorithm  5  terminates  with  x  as  the  computed  eigenvector  of  the  sym- 
metric  s.d.d.   matrix  H=AAA.    Then   there   is   a   diagonal   matrix  D    with   D,,  =   1—0(6),    €  = 
machine  precision,    and  a   matrix  hA.    ||5A  ||  =   0(6),   such   that  Dx  is  an   exact  eigenvector  of 
A  (A  —  5A)A.  Thus,  the  error  in  x  is  bounded  by  the  results  in  Theorems  6  and  7. 

Sketch  of  Proof:  Let  v,  be  the  computed  solution  of  (A  -rA  "-)>>,  =  y,  _]  at  the  last  iteration  of 
Algorithm  5.  Applying  the  error  analysis  of  the  proof  of  Theorem  4,  one  can  show  that  there 
is  a  diagonal  matrix  D,  D„=H-0(e),  and  an  £,  ||E||  =  0(e),  such  that 
D  (A  -;A~--£)Dy,  =  y,-\.  Applying  the  result  in  [2],  we  can  assume  E  is  symmetric.  Since 
Algorithm  5  guarantees  r  =  l/\\y,\\  =  0(e),  another  application  of  the  result  in  [2]  guaran- 
tees the  existence  of  a  symmetric  F,  \\F  \\  =  0(e),  such  that  (A  ~  zA~-  ~  E  -  F)Dy,  =  0.  Thus, 
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Dv,  is  an  exact  eigenvector  of  A  +  E  -r  F  —  \A    -  for  X  =  z,  and  Dx  =  DA' 
vector  of  A(A  *8A)A,  ||6A  ||  =    ||£+f  |1  =  0(e)  as  desired.  D 


>,  IS  an  exact  eigen- 


12.  Application  to  Differential  Operators 

Consider  the  n  by  n  second  central  difference  matrix 


H„  =  h-'- 


2     -1 
-1      • 


-1     2 


which  arises  from  discretizing   —d-fdx-   at  equally  spaced  grid  points  x,  =  ih,   l<isn.   One 


easily  computes  that  1  — 7  for  H„  is  1  — cos- 


1 


,  which  approaches  0  as  n-00.  This  is  to  be 


expected  since  H„  approximates  an  unbounded  operator,  and  so  has  a  wider  and  wider  range 
of  eigenvalues  as  n-00.  Since  the  diagonal  of  H„  is  constant,  nothing  is  gained  by  writing 
W„  =  AAA,  A„=  1,  and  our  perturbation  theory  merely  says  that  all  the  eigenvalues  are  at  least 
as  sensitive  as  the  smallest  one.  Our  theory  becomes  more  interesting  when  considering 
unevenly  spaced  grid  points  x,.  For  example,  let  h,  =  x,  —  x,-i,  and  suppose  ^i^i//!,=  (3  for  all  i; 
this  corresponds  to  a  uniformly  graded  grid.  Then  the  corresponding  H„(^)  has  diagonal 
entries  ranging  from  Zh^-p,'^  to  2Af-3~"""'.  One  can  show  7(3)  for  //„0)  satisfies 
7(P)  =  2(2+ p+ P"')~"-"y^7,  so  that  the  eigenvalues  of  H„(^)  are  always  at  least  as  accu- 
rately determined  as  the  eigenvalues  of  H„.  In  fact,  if  P^^l,  1  — ^(p)  is  bounded  away  from  0 
for  all  n. 
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13.  Summary  and  Future  Work 

We  have  shown  that  there  are  a  number  of  situations  where  tiny  eigenvalues  and  singu- 
lar values  can  be  determined  much  more  accurately  than  standard  perturbation  theorems  and 
numerical  algorithms  can  guarantee.  This  is  true  for  singular  values  of  bidiagonal  matrices, 
eigenvalues  of  symmetric  s.d.d.  matrices  and  eigenvalues  of  s.d.d.  definite  pencils.  In  addi- 
tion, eigenvectors  of  symmetric  s.d.d.  matrices  corresponding  to  relatively  isolated  eigen- 
values are  determined  accurately  by  the  data.  Open  questions  remain  in  the  perturbation 
theory  for  the  singular  vectors  of  bidiagonal  matrices  and  in  perturbation  theory  for  eigen- 
values of  s.d.d.  pencils  with  positive  and  negative  eigenvalues. 

The  following  is  a  tabular  summary  of  the  current  state  of  research  into  corresponding 
high  accuracy  algorithms.  We  consider  two  classes  of  algorithms  for  eigenvalues  and  singular 
values  (bisection  and  QR),  and  two  classes  of  algorithms  for  eigenvectors  and  singular  vec- 
tors (inverse  iteration  using  accurate  eigenvalues/singular  values,  and  QR).  If  an  algorithm 
was  presented  in  earlier  research,  a  reference  is  given;  otherwise  it  was  discussed  here  for  the 
first  time.  Conjectured  algorithms  are  also  indicated.  (Divide  and  conquer  is  another  tech- 
nique for  these  problems.  Since  its  ability  to  deliver  high  accuracy  has  not  been  proven  in 
any  of  the  cases  considered  in  this  paper,  it  qualifies  as  a  "conjectured"  algorithm  for  all  of 
them.) 

Bisection  based  algorithms  for  computing  eigenvalues  and  singular  values  to  high  relative  accu- 
racy: 

1)  Singular  values  of  bidiagonal  matrices  [9]. 

2)  Eigenvalues  of  symmetric  tridiagonal  s.d.d.  matrices  (Theorem  4  and  [14]). 

3)  Eigenvalues  of  not  necessarily  tridiagonal  symmetric  s.d.d.  matrices  (Algorithm  3). 

4)  Eigenvalues  of  s.d.d.  definite  pencils  (Algorithm  4). 

QR  based  algorithms  for  computing  eigenvalues  and  singular  values  to  high  relative  accuracy: 

1)  Singular  values  of  bidiagonal  matrices  [9]. 

2)  Eigenvalues  of  symmetric  positive  definite  tridiagonal  matrices  (Algorithm  2  and  [10]). 

3)  Eigenvalues  of  symmetric  indefinite  tridiagonal  scaled  diagonally  dominant  matrices:  no 
QR  based  algorithm  appears  to  work  in  general. 

Inverse  Iteration  based  algorithms  for  computing  eigenvectors  and  singular  vectors  accurately: 

1)       Eigenvectors  of  symmetric  s.d.d.  matrices  and  pencils  (Algorithm  5). 

QR  based  algorithms  for  computing  eigenvectors  and  singular  vectors  accurately: 

1)  Conjectured:  singular  vectors  of  bidiagonal  matrices  (Proposition  7  and  [6]). 

2)  Eigenvectors  of  symmetric  positive  definite  tridiagonal  s.d.d.  matrices  (Algorithm  2  and 
[6]).  (Conjectured:  the  finer  error  bounds  of  Theorem  7). 

3)  Eigenvectors  of  symmetric  indefinite  tridiagonal  s.d.d.  matrices:  since  the  eigenvalues 
apparently  cannot  be  computed  accurately,  neither  can  the  eigenvectors. 

In  summary,  various  numerical  algorithms  are  available  to  compute  eigenvalues  and 
eigenvectors,  and  singular  values  and  singular  vectors  with  high  accuracy.  Algorithm  2  for 
the  symmetric  positive  definite  tridiagonal  eigenproblem  will  be  incorporated  in  the 
LAPACK  linear  algebra  library  [8].  Not  all  algorithmic  questions  have  been  settled,  how- 
ever, and  these  will  be  the  subject  of  future  research. 
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